Read clipping using clip_r1, clip_r2, three_prime_clip_r1, three_prime_clip_r2 disabled in 3.10 #944

Closed
jlorent opened this issue Feb 14, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@jlorent
Contributor

jlorent commented Feb 14, 2023

Description of the bug

Setting clipping parameters does not seem to trim reads. Here is an example:

  • I run nextflow run nf-core/rnaseq -r 3.10 -profile test,singularity --outdir output
    I check the Sequence Length Distribution in FastQC (trimmed) and the max length is 100.

  • then I add --clip_r1 17 and run nextflow run nf-core/rnaseq -r 3.10 -profile test,singularity --outdir output --clip_r1 17
    the Sequence Length Distribution in FastQC (trimmed) is also 100 (whereas R1 reads are expected to be 17nt shorter). The trimgalore report (trimgalore/RAP1_IAA_30M_REP1_1.fastq.gz_trimming_report.txt) does not mention anything about trimming by 17 bp.

  • Finally, I run the same as just above but with version 3.9: nextflow run nf-core/rnaseq -r 3.9 -profile test,singularity --outdir output --clip_r1 17
    Here, the Sequence Length Distribution in FastQC (trimmed) is 84 for R1 and 100 for R2 as expected. The trimgalore report mentions: "All Read 1 sequences will be trimmed by 17 bp from their 5' end to avoid poor qualities or biases"
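The two checks described above (longest read after trimming, and whether the Trim Galore report announces the clipping) can also be done from the command line without opening FastQC. This is only a sketch: the function names are made up for illustration, the FASTQ is assumed to be gzipped with exactly four lines per record, and the path of the trimmed FASTQ depends on your output directory layout.

```shell
# Hypothetical helper: print the length of the longest read in a gzipped FASTQ.
# Assumes the standard 4-line record layout (sequence is line 2 of each record).
max_read_length() {
    gzip -cd "$1" | awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# Hypothetical helper: succeed if the trimming report contains the clipping
# message quoted above ("... sequences will be trimmed by 17 bp ...").
report_mentions_clip() {
    grep -q "sequences will be trimmed by" "$1"
}

# Example usage (paths are placeholders, adjust to your run):
#   max_read_length path/to/trimmed_R1.fq.gz
#   report_mentions_clip trimgalore/RAP1_IAA_30M_REP1_1.fastq.gz_trimming_report.txt && echo "clipping applied"
```

If the pipeline honoured --clip_r1 17, the first call should report a shorter maximum for R1 than for R2, and the second should succeed.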

Matthias Zepper figured out (see discussion on Slack) that

In pipeline version 3.9, the module still contained the lines:

// Clipping presets have to be evaluated in the context of SE/PE
def c_r1 = params.clip_r1 > 0 ? "--clip_r1 ${params.clip_r1}" : ''
def c_r2 = params.clip_r2 > 0 ? "--clip_r2 ${params.clip_r2}" : ''
def tpc_r1 = params.three_prime_clip_r1 > 0 ? "--three_prime_clip_r1 ${params.three_prime_clip_r1}" : ''
def tpc_r2 = params.three_prime_clip_r2 > 0 ? "--three_prime_clip_r2 ${params.three_prime_clip_r2}" : ''

which went missing in version 3.10 of the pipeline.

This seems to have been removed from the module and not added back in the pipeline.
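To make the effect of those lines concrete, here is the same "only add the flag when the value is greater than 0" logic transcribed to shell. It is an illustration only: the module itself is Groovy, and build_clip_args is a hypothetical name, not something that exists in the pipeline.

```shell
# Hypothetical helper mirroring the four ternaries from the 3.9 module:
# each flag is emitted only when its value is a positive integer.
# Arguments: clip_r1 clip_r2 three_prime_clip_r1 three_prime_clip_r2
build_clip_args() {
    out=""
    for pair in "--clip_r1:$1" "--clip_r2:$2" "--three_prime_clip_r1:$3" "--three_prime_clip_r2:$4"; do
        flag=${pair%%:*}     # text before the first ':'
        value=${pair#*:}     # text after the first ':'
        if [ "${value:-0}" -gt 0 ]; then
            out="${out:+$out }$flag $value"
        fi
    done
    printf '%s\n' "$out"
}

# build_clip_args 17 0 0 0   prints: --clip_r1 17
# build_clip_args 0 0 0 0    prints an empty line
```

With this step missing, as reported for 3.10, the values given on the command line are never turned into Trim Galore arguments, which matches the unchanged read lengths described above.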

Command used and terminal output

No response

Relevant files

No response

System information

Nextflow 22.10.4
ran tests locally and tried to trim real data on HPC (slurm) with the same results.
Container: singularity
OS: Linux Ubuntu
nf-core/rnaseq v3.9 and v3.10

@Bin-Xia

Bin-Xia commented Mar 12, 2023

I came across the same issue.

@maxulysse
Member

This issue has been fixed in dev. A release is coming soon, but in the meantime you can use the dev branch if you want.

@MatthiasZepper
Member

@Bin-Xia: As Ido Tamir pointed out in another issue, one can also work around this: either use a custom config for the process, or trim directly with the aligner via --extra_star_align_args "--clip5pNbases 23,4"
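A minimal sketch of the custom-config route, for anyone unsure what that looks like. Several things here are assumptions rather than something confirmed in this thread: the file name clip_workaround.config is arbitrary, the process selector '.*:TRIMGALORE' is a guess and should be replaced by the exact selector used in conf/modules.config of your pipeline version, and setting ext.args this way may replace the default arguments the pipeline passes to that process. --clip_R1 is Trim Galore's own option, and 17 is just the value from the original report.

```shell
# Write a small Nextflow config that passes the clipping flag to Trim Galore directly.
cat > clip_workaround.config <<'EOF'
process {
    // ASSUMPTION: selector name; copy the exact one from conf/modules.config
    withName: '.*:TRIMGALORE' {
        // ASSUMPTION: this may override the pipeline's default ext.args for the process
        ext.args = '--clip_R1 17'
    }
}
EOF

# Then pass it to the run with -c (not executed here):
#   nextflow run nf-core/rnaseq -r 3.10 -profile test,singularity --outdir output -c clip_workaround.config
```

Afterwards the trimming report should again contain the "will be trimmed by 17 bp" message; if it does not, the selector most likely did not match.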

@maxulysse: Is there any point in cherry-picking Julie's fix and making a patched release for the 3.10 branch as well? No version is entirely free of bugs, but parameters being simply ignored is, to me, a rather serious issue.

@drpatelh
Member

Fixed in #952

We will try to get a release out in the next week or so.
