Filter empty FastQ files after adapter trimming #292

Closed
svarona opened this issue Mar 2, 2022 · 2 comments
Labels
enhancement Improvement for existing functionality

@svarona
Contributor

svarona commented Mar 2, 2022

Description of feature

We suggest introducing a new checkpoint after fastp trimming, similar to the one that already exists after primer trimming:

pass = WorkflowIllumina.getFastpReadsAfterFiltering(json) > 0

to test whether the output trim.fastq.gz is empty or not. We could add it after this piece of code:
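
The snippet the comment points to is not reproduced here. Independently of it, the proposed check can be sketched in Python as a rough illustration. It assumes the fastp JSON report exposes the post-filtering read count at `summary.after_filtering.total_reads`; the function names `fastp_reads_after_filtering` and `passes_read_check` are hypothetical and are not the pipeline's own Groovy code.

```python
import json


def fastp_reads_after_filtering(json_text: str) -> int:
    """Read summary.after_filtering.total_reads from a fastp JSON report (assumed layout)."""
    report = json.loads(json_text)
    return int(report["summary"]["after_filtering"]["total_reads"])


def passes_read_check(json_text: str, min_reads: int = 1) -> bool:
    """Hypothetical checkpoint: keep a sample only if enough reads survive trimming."""
    return fastp_reads_after_filtering(json_text) >= min_reads


# Made-up report for a sample whose reads were all removed by trimming.
example = json.dumps(
    {
        "summary": {
            "before_filtering": {"total_reads": 1200},
            "after_filtering": {"total_reads": 0},
        }
    }
)
print(passes_read_check(example))  # prints False
```

Samples for which the check returns False would be dropped before downstream steps such as FastQC.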

@svarona svarona added the enhancement Improvement for existing functionality label Mar 2, 2022
@poojasgupta

I think I am facing a similar issue, which might be fixed by the feature suggested by @svarona.
Sometimes my input FastQ files are empty or very small because of failed sequencing, or they become empty after the fastp trimming step, and the pipeline then fails at the FastQC step. When this happens and I remove the failed sample, the pipeline completes successfully.
Here is the error message that I get:

Error executing process > 'NFCORE_VIRALRECON:ILLUMINA:FASTQC_FASTP:FASTQC_TRIM (220110_SDSDN16_UT_A01290_220117)'

Caused by:
  Process `NFCORE_VIRALRECON:ILLUMINA:FASTQC_FASTP:FASTQC_TRIM (220110_SDSDN16_UT_A01290_220117)` terminated with an error exit status (1)

Command executed:

  [ ! -f  220110_SDSDN16_UT_A01290_220117.fastq.gz ] && ln -s 220110_SDSDN16_UT_A01290_220117.trim.fastq.gz 220110_SDSDN16_UT_A01290_220117.fastq.gz
  fastqc --quiet --threads 6 220110_SDSDN16_UT_A01290_220117.fastq.gz
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_VIRALRECON:ILLUMINA:FASTQC_FASTP:FASTQC_TRIM":
      fastqc: $( fastqc --version | sed -e "s/FastQC v//g" )
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Failed to process 220110_SDSDN16_UT_A01290_220117.fastq.gz
  java.io.EOFException
  	at java.base/java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:269)
  	at java.base/java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:259)
  	at java.base/java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:165)
  	at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:80)
  	at java.base/java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:92)
  	at uk.ac.babraham.FastQC.Utilities.MultiMemberGZIPInputStream.<init>(MultiMemberGZIPInputStream.java:37)
  	at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:80)
  	at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
  	at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
  	at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:159)
  	at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:121)
  	at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
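
The `EOFException` raised while reading the gzip header suggests the file had no gzip content at all. As a minimal sketch, such a file could be detected before it reaches FastQC; the helper name `fastq_gz_has_reads` is hypothetical and this is not the fix applied in the pipeline.

```python
import gzip
import os
import tempfile


def fastq_gz_has_reads(path: str) -> bool:
    """Return True only if the gzipped FastQ file starts with a record header line."""
    if os.path.getsize(path) == 0:
        # A zero-byte file has no gzip header, which is what trips FastQC.
        return False
    try:
        with gzip.open(path, "rt") as handle:
            return handle.readline().startswith("@")
    except (EOFError, OSError):
        # Truncated or non-gzip input is treated as having no usable reads.
        return False


# Small demonstration with a zero-byte file and a one-record file.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty.trim.fastq.gz")
    open(empty_path, "wb").close()

    good_path = os.path.join(tmp, "good.trim.fastq.gz")
    with gzip.open(good_path, "wt") as out:
        out.write("@read1\nACGT\n+\nIIII\n")

    empty_result = fastq_gz_has_reads(empty_path)
    good_result = fastq_gz_has_reads(good_path)

print(empty_result, good_result)  # prints False True
```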

@drpatelh drpatelh added this to the 2.5 milestone Jul 8, 2022
@drpatelh drpatelh changed the title from "Check number of reads in fastp output trim.fastq files" to "Filter empty FastQ files after adapter trimming" Jul 8, 2022
drpatelh added a commit to drpatelh/nf-core-viralrecon that referenced this issue Jul 8, 2022
drpatelh added a commit to drpatelh/nf-core-viralrecon that referenced this issue Jul 11, 2022
@drpatelh
Member

Fixed in a1f810b and 6a0bc31.


3 participants