ERROR: Read name line should start with '@' #491

Closed
qquuzhao opened this issue May 26, 2023 · 12 comments

@qquuzhao

I get "ERROR: Read name line should start with '@'" when I process my data with fastp 0.23.3. The error does not occur with fastp 0.23.0. I think version 0.23.3 is missing the handling for these abnormal reads. Looking forward to your response.
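
For context, a FASTQ record is four lines: a name line starting with '@', the sequence, a separator line starting with '+', and the quality string. If a reader loses its place in the stream, the line it treats as a name line will usually not start with '@'. A minimal Python sketch of that kind of check follows; `check_fastq` is a hypothetical helper written for illustration and is not fastp's parser.

```python
def check_fastq(lines):
    """Validate 4-line FASTQ records and return the number of records.

    `lines` is any iterable of text lines; trailing "\n" or "\r\n" is ignored.
    """
    stripped = [ln.rstrip("\r\n") for ln in lines]
    if len(stripped) % 4 != 0:
        raise ValueError("truncated FASTQ record at end of input")
    for i in range(0, len(stripped), 4):
        name, seq, plus, qual = stripped[i:i + 4]
        if not name.startswith("@"):
            raise ValueError(f"line {i + 1}: read name line should start with '@'")
        if not plus.startswith("+"):
            raise ValueError(f"line {i + 3}: separator line should start with '+'")
        if len(seq) != len(qual):
            raise ValueError(f"line {i + 4}: quality length differs from sequence length")
    return len(stripped) // 4


# A well-formed record passes; the same lines shifted by one do not.
good = ["@r1\n", "ACGT\n", "+\n", "IIII\n"]
shifted = good[1:] + ["@r2\n"]
print(check_fastq(good))
```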

@sfchen
Member

sfchen commented May 26, 2023

Can you upload the data that failed?

@erinyoung

I am so glad that someone else ran into this problem!

I have some examples because I'm running into the same issue with some paired-end SARS-CoV-2 files that I use for a lot of my testing.

They are located at:
https://raw.githubusercontent.com/StaPH-B/docker-builds/master/tests/SARS-CoV-2/SRR13957123_1.fastq.gz
https://raw.githubusercontent.com/StaPH-B/docker-builds/master/tests/SARS-CoV-2/SRR13957123_2.fastq.gz

@nh13
Contributor

nh13 commented May 26, 2023

s10.R1.fq.gz
s10.R2.fq.gz

$ fastp \
        --in1 s10.R1.fq.gz \
        --in2 s10.R2.fq.gz \
        --out1 s10_1.fastp.fastq.gz \
        --out2 s10_2.fastp.fastq.gz \
        --json s10.fastp.json \
        --html s10.fastp.html \
        --thread 2 \
        --detect_adapter_for_pe
$ fastp --version
fastp 0.23.3

The smoking gun is this:

$ file s10.R1.fq.gz
s10.R1.fq.gz: Blocked GNU Zip Format (BGZF; gzip compatible), block length 8883

When I decompress and recompress with gzip, it runs just fine (similarly with decompressed FASTQ).

I also decompressed and recompressed with bgzip, and it fails.
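
The `file` output above identifies the input as BGZF, which is a series of gzip members that each carry an extra 'BC' subfield in the header. A rough Python sketch of how one might detect that header and redo the gzip round trip described above is shown here. It uses only the standard library; `is_bgzf` and `make_bgzf_block` are hypothetical helper names, and the hand-built block exists only so the example runs without the real files.

```python
import gzip
import struct
import zlib


def is_bgzf(header: bytes) -> bool:
    """Return True if the leading bytes look like a BGZF block header."""
    if len(header) < 12 or header[:3] != b"\x1f\x8b\x08":
        return False  # not a gzip/deflate stream
    if not header[3] & 0x04:
        return False  # FEXTRA flag not set, so no extra subfields
    (xlen,) = struct.unpack_from("<H", header, 10)
    extra = header[12:12 + xlen]
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return True  # the BGZF block-size subfield
        pos += 4 + slen
    return False


def make_bgzf_block(payload: bytes) -> bytes:
    """Build a single BGZF-style block by hand (demonstration only)."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate, no wrapper
    deflated = comp.compress(payload) + comp.flush()
    bsize = 12 + 6 + len(deflated) + 8 - 1  # total block length minus one
    header = struct.pack("<BBBBIBBH", 0x1F, 0x8B, 8, 4, 0, 0, 0xFF, 6)
    extra = struct.pack("<BBHH", ord("B"), ord("C"), 2, bsize)
    trailer = struct.pack("<II", zlib.crc32(payload), len(payload))
    return header + extra + deflated + trailer


record = b"@r1\nACGT\n+\nIIII\n"
bgzf = make_bgzf_block(record)
# Decompress and recompress as a plain gzip stream.
regz = gzip.compress(gzip.decompress(bgzf))
print(is_bgzf(bgzf), is_bgzf(regz))  # True False
```

For a file on disk, passing the first 18 bytes read in binary mode to `is_bgzf` is enough for this check.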

@zerobio

zerobio commented May 30, 2023

image
I also encountered this problem. Could the differences shown in the image above be what causes the error?

@sfchen
Member

sfchen commented May 30, 2023

@zerobio I traced the root cause to what you pointed out.

I will fix and update it soon.

sfchen added a commit that referenced this issue May 30, 2023
@sfchen
Member

sfchen commented May 30, 2023

Please try v0.23.4

@shenwei356

The binary at http://opengene.org/fastp/fastp has not been updated yet.

@sfchen
Member

sfchen commented May 30, 2023

Yes, that will be updated tomorrow.

@sfchen
Member

sfchen commented May 31, 2023

The pre-built binary was just updated, but the conda version is still waiting for the automatic version bump.

@matthdsm

Can confirm this issue is fixed in the latest version!

@qquuzhao
Author

qquuzhao commented Jun 1, 2023

Thank you very much! Yes, the issue is fixed.

qquuzhao closed this as completed Jun 1, 2023
@nh13
Contributor

nh13 commented Jun 1, 2023

Confirmed: it works on the bgzip'ed FASTQs posted above, thank you!

wdu added a commit to wdu/fastp that referenced this issue Jun 15, 2023
- \r\n handling may cause reading a byte past end of buffer, parser fails
- checking end-of-file condition can only be reliably done
  after a call to getLine() returns NULL. One particular case
  is that some gzip files contain empty gzip blocks at the end
  of the file, which can't be predicted by the current eof() code
  Tested with files provided in issue OpenGene#491.

This reverts commit 0ee1b3b, "fix a regression bug of FASTQ reader"
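
The two points in the commit message can be illustrated outside fastp. The sketch below is Python rather than the C++ change itself, and `read_lines` is a hypothetical helper: it builds a gzip stream whose last member is empty, reads until the line reader itself reports no more data instead of predicting end-of-file in advance, and strips both "\n" and "\r\n" endings.

```python
import gzip
import io

records = b"@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n"
# Two gzip members: the records, then an empty member at the end of the file.
data = gzip.compress(records) + gzip.compress(b"")


def read_lines(blob: bytes) -> list:
    """Read all lines from a (possibly multi-member) gzip byte string."""
    lines = []
    with gzip.GzipFile(fileobj=io.BytesIO(blob)) as fh:
        while True:
            line = fh.readline()
            if not line:
                # Only an empty read proves the data has ended; compressed
                # bytes may still remain after the last real record.
                break
            lines.append(line.rstrip(b"\r\n"))
    return lines


lines = read_lines(data)
print(len(lines) // 4)  # 2
```

After the last record has been read, the trailing empty member still sits in the compressed input, so a check based on remaining compressed bytes would wrongly expect another record.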
7 participants