cutadapt installed via conda igzip error for some fastq files #513
This should actually be an issue on the https://github.com/pycompression/xopen repository, but I seem to be unable to transfer the issue there, so let’s keep the discussion here. @rhpvorderman You may also want to have a look.
@zxl124, @andreott Since I haven’t been able to reproduce this so far, would one of you be able to supply me with a fastq.gz file that causes the error?
The error comes from the
A multi-block/concatenated gzip is created from the concatenation of multiple
which would give you
This avoids having to recompress everything, but some programs reading
Support for reading concatenated |
As a workaround, I believe you can either
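To illustrate the concatenated-gzip point above, here is a minimal sketch using only the Python standard library. It is not how xopen or isa-l read files; the FASTQ records and variable names are made up for the example. It shows that two independently compressed members can be joined byte-for-byte, that a multi-member-aware reader returns everything, and that a decoder which stops after the first member leaves the rest unread.

```python
import gzip
import zlib

# Two independently compressed gzip members (hypothetical FASTQ records).
record_a = b"@read1\nACGT\n+\nIIII\n"
record_b = b"@read2\nTTTT\n+\nIIII\n"
member_a = gzip.compress(record_a)
member_b = gzip.compress(record_b)

# Concatenating the compressed bytes yields a valid multi-member gzip.
concatenated = member_a + member_b

# gzip.decompress handles multi-member input and returns all the data.
full = gzip.decompress(concatenated)

# A single-stream decoder (wbits=31 selects the gzip container) stops at the
# end of the first member; the remaining bytes end up in unused_data.
decoder = zlib.decompressobj(wbits=31)
first_only = decoder.decompress(concatenated)
leftover = decoder.unused_data
```

A reader that behaves like the single-stream decoder here would silently truncate or reject such a file, which is the kind of failure discussed in this thread.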
Thanks so much for pointing out the package culpable for this problem. I compared my environments with working and problematic |
Great, that is very helpful. Is it possible for you to share one of the files that failed with me? It is not a problem if not. I understand this might not be possible for sensitive or secret data. |
I've sent you an email with a link to the fastq file. Thank you. |
Interesting. Thanks @zxl124 . Do you mind if @marcelm links me the fastq as well? EDIT: Or if you send me the link, that would be great too. Version 2.29 of isa-l can't handle concatenated gzips properly. This bug was found thanks to the xopen test suite. It was fixed in 2.30.0. The isa-l test suite now also has a test for it. Strange that the new build causes errors. I will take a look at it. |
Okay some extra information as I am also the isa-l-feedstock maintainer on conda:
Some stuff I did to try reproduce the issue:
It must be something special about the gz file, but I cannot debug further without it. I wonder if it has something to do with NULL bytes. NULL bytes are allowed between two concatenated gz files. It could also have something to do with the header. It would be nice to be able to reproduce the issue.
I have received the file that @zxl124 sent me and was strangely enough not able to reproduce the problem. I used the exact same build of the Conda package ( I have also tried to reproduce the issue by decompressing all @zxl124 Can you try to decompress the file with |
This is interesting. I was indeed unable to reproduce this on linux (Ubuntu 18.04.3). I forgot to mention when I ran into this problem I was doing everything with docker, using
Error message:
The result to using
Also tried a newer version, |
@zxl124 Thanks for mentioning this.
So basically it is just an almost pure debian buster container. Weird. I also run debian buster as my OS. Maybe it is debian related? Can you send the file to me? |
@rhpvorderman Just sent you an email with the link to the fastq |
I can reproduce the issue on Debian buster. Not container related. |
This probably has to do with the compiler move (as announced at https://conda-forge.org/docs/user/announcements.html#announcements). It is a bit strange that it does not work on Debian, though. I will check this out.
I could also reproduce on Debian Buster. I noticed that the file is actually bgzipped:
In case it helps, I was able to create a much smaller reproducer (33 MiB) that I can share publicly (some SRA data I had floating around): |
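For readers who want to check whether a file is bgzipped, the following is a rough sketch, not the check used in this thread. It assumes the BGZF layout from the SAM/BAM specification: each block is a gzip member whose FEXTRA field contains a subfield with identifier "BC" and a 2-byte payload. The function name looks_like_bgzf and the synthetic header are hypothetical and only serve the example.

```python
import gzip
import struct


def looks_like_bgzf(header: bytes) -> bool:
    """Return True if the bytes start with a gzip header carrying a 'BC' extra subfield."""
    if len(header) < 12:
        return False
    id1, id2, method, flags = header[0], header[1], header[2], header[3]
    # gzip magic, deflate method, and the FEXTRA flag (0x04) must be present.
    if (id1, id2, method) != (0x1F, 0x8B, 8) or not flags & 0x04:
        return False
    (xlen,) = struct.unpack_from("<H", header, 10)
    extra = header[12:12 + xlen]
    pos = 0
    # Walk the extra subfields: 2 ID bytes, a 2-byte length, then the payload.
    while pos + 4 <= len(extra):
        si1, si2, slen = struct.unpack_from("<BBH", extra, pos)
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return True
        pos += 4 + slen
    return False


# Synthetic BGZF-style header for demonstration (the block size value is arbitrary).
bgzf_header = struct.pack("<BBBBIBBH", 0x1F, 0x8B, 8, 4, 0, 0, 255, 6)
bgzf_header += b"BC" + struct.pack("<HH", 2, 27)

plain_gzip = gzip.compress(b"ACGT\n")
```

To test a real file, one could read its first few dozen bytes in binary mode and pass them to the function. A bgzipped file is also a concatenation of many small gzip members, so it exercises the multi-member code path of any reader.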
I can confirm that build |
@zxl124 Thanks for reporting back. Can this issue be closed now? Or do you have some other unexpected errors with regards to this issue? |
Yes. Thank you so much for fixing this. |
Only very recently (~2 weeks ago), cutadapt installed via conda has the following error:
This only happens with some fastq files, only in multi-thread mode (with -j specified), and only with conda-installed cutadapt. I've tried version 3.2 and 3.1. I've tried rolling back versions of some dependencies, including pigz=2.3.4, xopen=1.0.1, and dnaio=0.4.4; none of these helps.
I understand this is probably a bioconda problem rather than a cutadapt problem. This has also been reported in the bioconda recipes repo. So far no solutions. I am hoping the cutadapt developers might be able to point me to a few places to look, since you understand the error message better. Thank you.