Error: "The file provided does not have the proper fastq format" or hanging when supplying gzipped files #66

Closed
jfy133 opened this issue Jun 24, 2024 · 3 comments

Comments

@jfy133

jfy133 commented Jun 24, 2024

I wanted to test nonpareil with the new gzip functionality, but I'm encountering a variety of errors.

When using the bioconda recipe (which is the same as the previous one, just with the new code and a zlib dependency), I get the following error on all files I test:

$ nonpareil -s ERX5474932_ERR5766176_1.fastq.gz -T kmer -f fastq -b output
Nonpareil v3.5.1
 [      0.0]   The file ERX5474932_ERR5766176_1.fastq.gz.enve-tmp.158173 was created
 [      0.0]  reading ERX5474932_ERR5766176_1.fastq.gz.enve-tmp.158173
 [      0.0]  Picking 10000 random sequences
 [      0.0]  Started counting
Fatal error:
The file provided does not have the proper fastq format
 [      0.0] Fatal error: The file provided does not have the proper fastq format

So I tried the compiled binary attached to the release here on GitHub,

$ wget https://github.com/lmrodriguezr/nonpareil/releases/download/v3.5.1/nonpareil-3.5.1-Linux_x86_64
$ chmod +x nonpareil-3.5.1-Linux_x86_64

and it works if I uncompress the file first (uncompressed to 'test.fastq'):

$ ./nonpareil-3.5.1-Linux_x86_64 -s test.fastq -T kmer -f fastq -b output
Nonpareil v3.5.1
 [      0.0]  reading test.fastq
 [      0.0]  Picking 10000 random sequences
 [      0.0]  Started counting
 [      0.1]  Read file with 632060 sequences
 [      0.1]  Average read length is 151.000000bp
 [      0.1]          Worker 0 @start_samples.
 [      0.1]  Sub-sampling library
 [      0.2]          Worker 0 @start_checkings.                      
 [      0.2]  Evaluating consistency                              
 [      0.2]  Everything seems correct
 [      0.2]          Worker 0 @exit.

But it just hangs forever on the gzipped file:

$ ./nonpareil-3.5.1-Linux_x86_64 -s ERX5474932_ERR5766176_1.fastq.gz -T kmer -f fastq -b output
Nonpareil v3.5.1
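
For anyone hitting the same hang, the uncompressed run above suggests a stopgap: decompress the file first and pass the plain FASTQ to `-s`. A minimal Python sketch of that step, using only the standard library; `decompress_gz` is a hypothetical helper name for illustration, not part of nonpareil:

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src: str, dst: str) -> int:
    """Stream-decompress a gzip file to dst and return the bytes written.

    Hypothetical helper for illustration; not part of nonpareil.
    """
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        # copyfileobj copies in chunks, so a large FASTQ file is never
        # held in memory all at once.
        shutil.copyfileobj(fin, fout)
    return Path(dst).stat().st_size
```

The resulting file can then be supplied the same way as in the working run, i.e. `-s test.fastq`.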

I've placed the two test files I tried this on on Dropbox here; they are valid FASTQ files, as I use them in a variety of pipelines.
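
To check the files independently of nonpareil, something like the following could be used. It is only a rough sketch: `check_fastq_gz` is a hypothetical name, it assumes unwrapped 4-line FASTQ records with no trailing blank lines, and it looks at structure only (gzip magic bytes, `@` header, `+` separator, matching sequence and quality lengths), not at the quality encoding.

```python
import gzip
from itertools import islice


def check_fastq_gz(path: str, max_records: int = 1000) -> int:
    """Check up to max_records records of a gzipped FASTQ file.

    Returns the number of records checked and raises ValueError on the
    first malformed record. Assumes unwrapped 4-line records.
    """
    with open(path, "rb") as raw:
        # Every gzip stream starts with the magic bytes 0x1f 0x8b.
        if raw.read(2) != b"\x1f\x8b":
            raise ValueError(f"{path}: not a gzip file")

    checked = 0
    with gzip.open(path, "rt") as handle:
        while checked < max_records:
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if not record:
                break  # clean end of file
            if len(record) < 4:
                raise ValueError(f"record {checked + 1}: truncated")
            header, seq, plus, qual = record
            if not header.startswith("@"):
                raise ValueError(f"record {checked + 1}: header lacks '@'")
            if not plus.startswith("+"):
                raise ValueError(f"record {checked + 1}: separator lacks '+'")
            if len(seq) != len(qual):
                raise ValueError(f"record {checked + 1}: length mismatch")
            checked += 1
    return checked
```

If this passes on a file that nonpareil rejects, the problem is more likely in how the compressed input is read than in the file itself.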

Note that in all cases empty tmp files are generated, e.g. for the three most recent (failed) tests:

-rw-rw-r-- 1 james james    0 Jun 24 10:12 ERX5474932_ERR5766176_1.fastq.gz.enve-tmp.160124
-rw-rw-r-- 1 james james    0 Jun 24 10:16 ERX5474932_ERR5766176_1.fastq.gz.enve-tmp.161707
-rw-rw-r-- 1 james james    0 Jun 24 10:17 ERX5474932_ERR5766176_1.fastq.gz.enve-tmp.162390
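
A quick way to spot these leftovers is to look for zero-byte files matching the temp-file name. This is a minimal sketch; the `<input>.enve-tmp.<number>` pattern is inferred from the listing above rather than from nonpareil's documentation, and `empty_temp_files` is a hypothetical helper name.

```python
import glob
from pathlib import Path
from typing import List


def empty_temp_files(input_path: str) -> List[Path]:
    """Return zero-byte '<input>.enve-tmp.*' files next to input_path.

    The naming pattern is an assumption taken from the observed listing.
    """
    src = Path(input_path)
    # Escape the input name so characters such as '[' are matched literally.
    pattern = glob.escape(src.name) + ".enve-tmp.*"
    return sorted(
        p for p in src.parent.glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )
```

An empty temp file alongside a non-empty input would be consistent with the decompression step writing nothing before the FASTQ parser runs, though that is a guess from the symptoms, not something confirmed in this thread.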

@jfy133 changed the title from 'Error: "The file provided does not have the proper fastq format" when supplying gzipped files' to 'Error: "The file provided does not have the proper fastq format" or hanging when supplying gzipped files' on Jun 24, 2024

@lmrodriguezr
Owner

Thank you for all the testing, @jfy133, and hopefully this is the last of the faulty releases. Apologies for that. Please feel free to reopen if the new version doesn't work for you (I'll be creating a release very soon).

@jfy133
Author

jfy133 commented Jun 28, 2024

That appears to work better now :D I noticed something else (will open a separate issue), but it appears to run without erroring now. I'll look into updating bioconda.

@jfy133
Author

jfy133 commented Jun 28, 2024

Actually, never mind, @martin-g beat me to it :D

bioconda/bioconda-recipes#48783
