Differences between seqtk and Biopython fastq-solexa conversion #48

Closed
gittaylor opened this issue Feb 9, 2015 · 2 comments

@gittaylor

Hi Heng,

I am using this toolkit to convert Solexa (Illumina <1.3) FASTQ files to the newer format. I compared the result with the output of the Bio.SeqIO.convert function (Biopython toolkit), and I am seeing consistent differences of 1 to 2 in the quality scores at low quality values. Do you know why this might be happening?

Thanks,
Taylor

@tseemann

@gittaylor This is probably because Solexa originally used a different formula to convert probabilities to Q values, which is described here: http://en.wikipedia.org/wiki/FASTQ_format#Quality
Note how the different formulae only diverge for low qualities. I suspect the Python code is doing a "true" conversion, whereas seqtk is just offsetting the ASCII values.
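
To make the divergence concrete, here is a minimal sketch comparing the two approaches. It assumes the standard relation between Solexa and Phred scores, Q_phred = 10 * log10(10^(Q_solexa / 10) + 1), with an ASCII offset of 64 for Solexa and 33 for Sanger. The function names are hypothetical, and `convert_offset_only` only illustrates the suspected behaviour; it is not taken from the seqtk source.

```python
import math

SOLEXA_OFFSET = 64  # ASCII offset used by Solexa / Illumina <1.3 FASTQ
SANGER_OFFSET = 33  # ASCII offset used by standard (Sanger) FASTQ


def solexa_to_phred(q_solexa):
    """Map a Solexa score to a Phred score via the error probability."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def convert_true(qual):
    """Re-encode a Solexa quality string as Sanger, converting each score."""
    return "".join(
        chr(round(solexa_to_phred(ord(c) - SOLEXA_OFFSET)) + SANGER_OFFSET)
        for c in qual
    )


def convert_offset_only(qual):
    """Hypothetical shortcut: shift the ASCII codes without converting scores."""
    return "".join(chr(ord(c) - (SOLEXA_OFFSET - SANGER_OFFSET)) for c in qual)


if __name__ == "__main__":
    # Print the gap between the two methods for a few Solexa scores.
    for q in (2, 5, 10, 20, 40):
        print(q, round(solexa_to_phred(q)), round(solexa_to_phred(q)) - q)
```

Under these assumptions the two methods agree at high scores (Solexa 20 and 40 map to Phred 20 and 40) and differ by a point or two lower down (Solexa 5 maps to Phred 6, Solexa 2 to Phred 4), which matches the pattern reported above.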

@lh3
Owner

lh3 commented Apr 6, 2016

As @tseemann said, seqtk is unable to convert Illumina <1.3 FASTQ to standard FASTQ.
