syncing script is not flexible with header formats #19

Closed
tamburinif opened this issue Mar 16, 2019 · 3 comments

Comments

@tamburinif

It appears that the syncing script cannot handle FASTQ headers that are not in the standard Illumina format, and it puts all such reads into the orphans file. This commonly affects data downloaded from SRA, for example. We definitely need to fix this ASAP.

@tamburinif
Author

edit: this works for some SRA data but not all

@bsiranosian
Contributor

Can you post an example of headers where this fails? You could also adjust the regex on line 59 to match what your SRA reads have.

@tamburinif
Author

tamburinif commented Mar 19, 2019

It looks like any header that ends in /1 or /2 fails, which isn't great.
E.g.: @HWUSI-EAS740_103031124:1:100:10000:11286/1

I made my own version of the script with adjusted regex which I can share if needed
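
For anyone hitting the same problem, here is a minimal sketch of what a more tolerant header match could look like. It is not the regex from line 59 of the script, and not necessarily the adjusted version mentioned above. The names `READ_ID` and `pair_key` are made up for illustration. The idea is to reduce each header to a mate-independent read name, accepting either a trailing /1 or /2 or a space-separated comment after the name.

```python
import re

# Hypothetical pattern for illustration only; not taken from the syncing script.
READ_ID = re.compile(
    r"^@?(?P<name>\S+?)"      # read name, up to the first whitespace
    r"(?:/(?P<mate>[12]))?"   # optional legacy "/1" or "/2" mate suffix
    r"(?:\s.*)?$"             # optional comment field after whitespace
)


def pair_key(header: str) -> str:
    """Return the part of a FASTQ header that both mates should share."""
    match = READ_ID.match(header.strip())
    if match is None:
        raise ValueError(f"unrecognized FASTQ header: {header!r}")
    return match.group("name")


# The failing header from this thread and its presumed mate map to one key.
fwd = pair_key("@HWUSI-EAS740_103031124:1:100:10000:11286/1")
rev = pair_key("@HWUSI-EAS740_103031124:1:100:10000:11286/2")

# A made-up header in the newer Illumina style (mate number in the comment).
new_style = pair_key("@M00123:45:000000000-ABCDE:1:1101:15589:1332 2:N:0:1")
```

Reads from the two files whose keys are equal would be treated as a pair, and anything left over would go to the orphans file. Headers whose mates differ in some other way would still need their own handling.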


2 participants