MVP data appear to be phred+64 #60

Closed
wasade opened this issue Jan 10, 2017 · 1 comment

wasade commented Jan 10, 2017

Both the metadata.yaml written after importing the raw sequences and the metadata.yaml written after demux indicate that the data are phred+33. That does not appear to be accurate, as the quality strings include characters that fall outside the phred+33 encoding (e.g., "["):

$ funzip sequences.fastq.gz | head
@HWI-EAS440_0386:1:23:17547:1423#0/1
TACGNAGGATCCGAGCGTTATCCGGATTTATTGGGTTTAAAGGGAGCGTAGATGGATGTTTAAGTCAGTTGTGAAAGTTTGCGGCTCAACCGTAAAATTGCAGTTGATACTGGATATCTTGAGTGCAGTTGAGGCAGGGGGGGATTGGTGTG
+
hhhdHddddddddfehhfhhhghggfhhhfhhgggfhhgfgdfcfhehfdgfhggfggfggffgddfgdffdgdaagaaddcbdccc]a^ad__a]_____ba_`a`__^__\]^OWZR\Z\\WYTZ_U^BBBBBBBBBBBBBBBBBBBBBB
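
For anyone who wants to check this without eyeballing the quality line, here is a rough sketch of the usual heuristic: look at the lowest and highest ASCII codes in the quality strings. The function name `guess_phred_offset` and the cutoffs are my own assumptions, not anything from Q2. It assumes Illumina-style data where phred+33 scores top out around Q41 ('J'), so it would misjudge data with legitimately higher scores.

```python
def guess_phred_offset(quality_lines):
    """Heuristically guess the ASCII offset (33 or 64) of FASTQ quality strings.

    Hypothetical helper, not part of QIIME 2. Returns None when the observed
    characters are compatible with both encodings.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    # Assumption: offset-64 encodings never go below ';' (59, Solexa Q-5),
    # so anything lower can only be phred+33.
    if lowest < 59:
        return 33
    # Assumption: Illumina phred+33 data rarely exceeds 'J' (74, Q41),
    # so higher characters suggest phred+64.
    if highest > 74:
        return 64
    return None


# Quality line from the record above (raw string because of the backslashes).
qual = r"hhhdHddddddddfehhfhhhghggfhhhfhhgggfhhgfgdfcfhehfdgfhggfggfggffgddfgdffdgdaagaaddcbdccc]a^ad__a]_____ba_`a`__^__\]^OWZR\Z\\WYTZ_U^BBBBBBBBBBBBBBBBBBBBBB"
print(guess_phred_offset([qual]))  # prints 64
```

A single read is weak evidence, so in practice you would feed it the quality lines of the first few thousand records before trusting the answer.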

I checked with @gregcaporaso and he indicated that it made sense given the age of the data.

As of this writing, I do not believe the inaccuracy in metadata.yaml has a functional impact, as quality scores are not interrogated explicitly by Q2 in this tutorial.

@jairideout

Ouch! Good catch, I ported the issue here.
