Import of fastq files from ENA get assigned fastqsanger instead of fastqsanger.gz #6900

Open
lparsons opened this issue Oct 19, 2018 · 4 comments

Comments

@lparsons
Contributor

I believe this is due to the addition of the fastqsanger.gz filetype. The ENA is "assigning" a filetype of fastqsanger (which used to work), and Galaxy accepts it and even shows a correct "peek" of the fastq file in the history. However, tools (e.g. FastQC) run against the file fail, complaining about the format (e.g. that a line does not start with the @ character).

This isn't really a Galaxy bug per se, but it is an issue with the Galaxy experience for users.
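To illustrate the failure mode, here is a minimal sketch (not Galaxy or FastQC code) of why a gzip-compressed file labelled as plain fastqsanger trips a "line does not start with @" check. It assumes the file is gzip-compressed; the helper names `looks_gzipped` and `first_fastq_line` are hypothetical.

```python
import gzip
import io

# gzip streams always begin with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(head: bytes) -> bool:
    """Return True if the data starts with the gzip magic number."""
    return head[:2] == GZIP_MAGIC


def first_fastq_line(data: bytes) -> str:
    """Return the first text line, decompressing first if the data is gzipped."""
    stream = io.BytesIO(data)
    if looks_gzipped(data):
        stream = gzip.GzipFile(fileobj=stream)
    return stream.readline().decode("ascii", errors="replace").rstrip("\n")


record = b"@read1\nACGT\n+\nIIII\n"
compressed = gzip.compress(record)

# A tool that reads the compressed bytes as plain text never sees the "@".
print(compressed.startswith(b"@"))
# After decompression the header line is intact.
print(first_fastq_line(compressed))
```

A tool that trusts the fastqsanger label reads the raw bytes and sees the gzip header instead of a record header, even though a decompressing "peek" looks fine.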

@mvdbeek
Member

mvdbeek commented Oct 19, 2018

I think we should probably disregard filetypes sent by external parties at this point. Seems we'd be better off relying on our sniffers.

@lparsons
Contributor Author

I'd be in favor of an additional flag to force override of sniffers. That way servers that aren't updated (ENA) would get the new behavior, but things that want to be specific still could. The main issue I see is that fastq sniffers generally assign type "fastq" and not "fastqsanger", rendering the files useless without a completely pointless Fastq Groomer run. Unless that behavior has changed?
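The precedence being proposed could be sketched roughly as follows. This is purely illustrative: `resolve_datatype` and `force_declared` are hypothetical names, not existing Galaxy parameters, and falling back to the declared type when sniffing finds nothing is an assumption of this sketch.

```python
from typing import Optional


def resolve_datatype(
    declared: Optional[str],
    sniffed: Optional[str],
    force_declared: bool = False,
) -> Optional[str]:
    """Pick a datatype for an upload from an external source.

    declared       -- type sent by the external party (e.g. ENA)
    sniffed        -- type detected from the file content, or None
    force_declared -- hypothetical flag: the sender insists on its own type
    """
    if force_declared and declared:
        return declared
    # Default: trust the content over the label; fall back to the label
    # only when sniffing produced nothing (an assumption of this sketch).
    return sniffed or declared


# A server that has not been updated sends the old label, sniffing wins.
print(resolve_datatype("fastqsanger", "fastqsanger.gz"))
# A sender that wants to be specific can still force its type.
print(resolve_datatype("fastqsanger", "fastq", force_declared=True))
```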

@martenson
Member

We have been sniffing fastqsanger since #4237 (it does not cover all cases, obviously).

@mvdbeek
Member

mvdbeek commented Oct 19, 2018

Yes, we sniff fastqsanger if the quality values are compatible with sanger encoding. We will also soon have a colorspace sniffer (but that data isn't much used anymore). Everything else will be flat fastq, as the Illumina and Solexa variants are not easy to discriminate.
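A rough sketch of what a "compatible with sanger encoding" check can look like is below. It is not the Galaxy sniffer: the function names are hypothetical, and the cap of 60 on the Phred score is an assumed threshold chosen for illustration. Sanger qualities are stored as ASCII with an offset of 33.

```python
SANGER_OFFSET = 33  # Phred+33 encoding
MAX_PHRED = 60      # assumed upper bound for plausible Sanger scores


def quality_lines(fastq_text: str) -> list:
    """Return every fourth line of a 4-line-per-record FASTQ text."""
    return fastq_text.splitlines()[3::4]


def qualities_fit_sanger(lines, max_phred: int = MAX_PHRED) -> bool:
    """True if every quality character decodes to a score in 0..max_phred."""
    seen = False
    for line in lines:
        for ch in line:
            seen = True
            score = ord(ch) - SANGER_OFFSET
            if score < 0 or score > max_phred:
                return False
    return seen  # an empty input is not treated as a match


sanger = "@r1\nACGT\n+\n!5?I\n"
# "h" is a common high quality in Phred+64 data; under +33 it decodes to 71.
phred64 = "@r1\nACGT\n+\nhhhh\n"

print(qualities_fit_sanger(quality_lines(sanger)))
print(qualities_fit_sanger(quality_lines(phred64)))
```

Because Illumina and Solexa qualities both use an offset of 64 and overlap almost entirely in their character ranges, a range check like this cannot tell them apart, which is consistent with leaving those files as plain fastq.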

3 participants