FASTQ file ended prematurely #291
I am using the cutadapt plugin in QIIME2 to trim primers from paired end reads. For most samples it works fine, but one sample is returning an error and I can't seem to track down the source of the problem. Any help would be appreciated.
Below is the command in QIIME2 along with resulting output:
ERROR: Traceback (most recent call last):
cutadapt: error: FASTQ file ended prematurely
Plugin error from cutadapt:
Command '['cutadapt', '--cores', '2', '--error-rate', '0.1', '--times', '1', '--overlap', '3', '-o', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-q57zg2fe/P91_Y16_B_284_L001_R1_001.fastq.gz', '-p', '/tmp/q2-CasavaOneEightSingleLanePerSampleDirFmt-q57zg2fe/P91_Y16_B_285_L001_R2_001.fastq.gz', '--front', 'GGACTACHVGGGTWTCTAAT', '-G', 'GTGCCAGCMGCCGCGGTAA', '/tmp/qiime2-archive-kptfm8np/464bb15a-a936-48d4-8420-6f1249e567f9/data/P91_Y16_B_284_L001_R1_001.fastq.gz', '/tmp/qiime2-archive-kptfm8np/464bb15a-a936-48d4-8420-6f1249e567f9/data/P91_Y16_B_285_L001_R2_001.fastq.gz']' returned non-zero exit status 1
See above for debug info.
I will need to make the error message better, but from what I can tell, the problem is that the second FASTQ file
Are you sure that
Both files are from the same dataset, each containing 42275 reads.
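For anyone who wants to repeat that check, here is a minimal sketch of how the read count of a gzipped FASTQ file could be verified. It assumes plain four-line records (no wrapped sequence lines), and the helper name `count_fastq_records` is made up for illustration; it is not part of Cutadapt or QIIME2.

```python
import gzip


def count_fastq_records(path):
    """Return the number of records in a gzipped FASTQ file.

    Assumes every record occupies exactly four lines. Raises ValueError
    if the total line count is not a multiple of four, which is what a
    file with an incomplete last record would look like.
    """
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

Calling it on the R1 and R2 files and comparing the two results shows whether the pair is consistent on disk; equal counts with no exception suggest the inputs themselves are complete.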
The last four lines of P91_Y16_B_284_L001_R1_001.fastq.gz are:
And the last four lines of P91_Y16_B_285_L001_R2_001.fastq.gz are:
Yes, I should have remembered that the error message is different when the files come from different datasets.
Can you make the entire dataset available to me, so I can try to reproduce the problem? Privately via e-mail to email@example.com if you prefer. Cutadapt 1.15 splits the input FASTQ file into chunks so it can work on them in parallel. I wonder whether something goes wrong while creating those chunks.
If I cannot reproduce it, I’ll ask you to report this to the developers of the QIIME plugin.
Thanks a lot for reporting this; there was indeed a bug in how paired-end FASTQ files are split into chunks. Under rare circumstances, the last chunk of the R2 reads could be incomplete. Fortunately, this problem is then caught by the code that parses the FASTQ file, which is why you would get the “FASTQ file ended prematurely” message (even though the file on disk is complete). (At least this is better than silently getting incorrect results.)
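To illustrate the invariant involved, here is a rough sketch of splitting two in-memory FASTQ buffers into paired chunks so that every R1 chunk and its R2 partner hold the same number of complete records. This is not Cutadapt's actual implementation: the function names are hypothetical, the whole input is assumed to fit in memory, and records are assumed to be exactly four lines long.

```python
def record_ends(data):
    """Return byte offsets just past each complete four-line record."""
    ends = []
    newlines = 0
    pos = data.find(b"\n")
    while pos != -1:
        newlines += 1
        if newlines % 4 == 0:
            ends.append(pos + 1)
        pos = data.find(b"\n", pos + 1)
    return ends


def split_paired_chunks(r1, r2, max_records):
    """Yield (r1_chunk, r2_chunk) pairs with equal numbers of records.

    Chunk boundaries are chosen by record index, not by byte offset,
    because R1 and R2 records generally differ in length.
    """
    ends1, ends2 = record_ends(r1), record_ends(r2)
    for name, data, ends in (("R1", r1, ends1), ("R2", r2, ends2)):
        complete = ends[-1] if ends else 0
        if complete != len(data):
            raise ValueError(f"{name} ends with an incomplete record")
    if len(ends1) != len(ends2):
        raise ValueError("R1 and R2 contain different numbers of records")

    start1 = start2 = 0
    n = len(ends1)
    for first in range(0, n, max_records):
        last = min(first + max_records, n) - 1
        yield r1[start1:ends1[last]], r2[start2:ends2[last]]
        start1, start2 = ends1[last], ends2[last]
```

The point of the sketch is that the final pair of chunks must be cut at the same record index in both files. If the R2 boundary is computed independently of R1, the last R2 chunk can end mid-record, and a parser reading that chunk would report a premature end of file even though the file on disk is intact.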
I’ll make a bugfix release as soon as possible.