VERY slow performance with fastq.gz files #229

Closed
spolson opened this issue Jan 31, 2023 · 3 comments

Comments

@spolson

spolson commented Jan 31, 2023

I recently ran several different assemblies through Racon (ver 1.4.3) with very lengthy execution times. In reviewing these, I noticed that almost all of the time was spent in "loading sequences" (nearly 12 hours in many of my runs). I decided to first decompress the fastq file with gzip (which took ~25 minutes) and reran. This time, "loading sequences" took less than 6 minutes.
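
For readers who want to script the same workaround, here is a minimal sketch of the decompress-first step in Python. It uses only the standard library; the function name `decompress_fastq` and the chunk size are illustrative assumptions, not part of Racon or of the original report, which used the `gzip` command-line tool.

```python
import gzip
import shutil
from pathlib import Path


def decompress_fastq(gz_path, out_path=None, chunk_size=16 * 1024 * 1024):
    """Stream-decompress a .gz file to disk without loading it into memory.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    gz_path = Path(gz_path)
    if out_path is None:
        if gz_path.suffix != ".gz":
            raise ValueError(f"expected a .gz file, got {gz_path.name}")
        # Default output name: drop the trailing ".gz" (reads.fastq.gz -> reads.fastq).
        out_path = gz_path.with_suffix("")
    out_path = Path(out_path)
    # Copy in fixed-size chunks so memory use stays flat even for very large inputs.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return out_path
```

The returned path can then be passed to racon in place of the gzipped input. Note that the uncompressed file needs enough free disk space, which can be more than twice the size of the compressed file.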

An example is below, but I have numerous others with comparable issues. In this case the read file was 53 GB gzipped and 126 GB uncompressed (the nodes had 1 TB of RAM and nothing else executing):

gzipped fastq

racon -u -t 48 {HiFi.fastq.gz} {minimap2.sam} {genome.fasta} > polished.fasta

[racon::Polisher::initialize] loaded sequences 43007.108622 s

unzipped fastq

racon -u -t 48 {HiFi.fastq} {minimap2.sam} {genome.fasta} > polished.fasta

[racon::Polisher::initialize] loaded sequences 571.343952 s
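
To put the two log lines above side by side, here is a small sketch that parses the reported load times and computes the ratio. The regular expression and the helper name `loaded_seconds` are assumptions made for this illustration. The 25-minute decompression cost is the approximate figure from the report, not a measured value.

```python
import re

# Matches the "loaded sequences" line as it appears in the logs quoted above.
LOG_LINE = re.compile(
    r"\[racon::Polisher::initialize\] loaded sequences (\d+(?:\.\d+)?) s"
)


def loaded_seconds(line):
    """Extract the load time in seconds from one log line (hypothetical helper)."""
    match = LOG_LINE.search(line)
    if match is None:
        raise ValueError(f"not a 'loaded sequences' line: {line!r}")
    return float(match.group(1))


gz_seconds = loaded_seconds(
    "[racon::Polisher::initialize] loaded sequences 43007.108622 s"
)
plain_seconds = loaded_seconds(
    "[racon::Polisher::initialize] loaded sequences 571.343952 s"
)
gunzip_seconds = 25 * 60  # "~25 minutes" of manual decompression, as reported

# Ratio of load times alone, and the ratio once decompression time is included.
speedup = gz_seconds / plain_seconds
speedup_with_gunzip = gz_seconds / (plain_seconds + gunzip_seconds)

print(f"gzipped load:   {gz_seconds / 3600:.2f} h")
print(f"plain load:     {plain_seconds / 60:.2f} min")
print(f"speedup:        {speedup:.1f}x")
print(f"incl. gunzip:   {speedup_with_gunzip:.1f}x")
```

For this run, loading the uncompressed file is roughly 75 times faster, and still roughly 21 times faster once the separate decompression step is counted.
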
@rvaser
Collaborator

rvaser commented Jan 31, 2023

Please use a newer version (from https://github.com/lbcb-sci/racon or bioconda). The parsing has been fixed since v1.4.4.

Best regards,
Robert
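
A quick way to tell whether an installed build predates the fix is to compare its version number against 1.4.4. The sketch below assumes the version is available as a dotted string such as `1.4.3` or `v1.4.3`. The exact format Racon prints is an assumption here, and the helper names are hypothetical.

```python
FIXED_IN = (1, 4, 4)  # first release with the parsing fix, per the reply above


def parse_version(text):
    """Turn '1.4.3' or 'v1.4.3' into (1, 4, 3). Assumes a plain dotted version."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def has_parsing_fix(version_text):
    """Return True if the given version is 1.4.4 or newer."""
    # Tuples compare element by element, so (1, 4, 13) sorts after (1, 4, 4),
    # which a plain string comparison would get wrong.
    return parse_version(version_text) >= FIXED_IN


print(has_parsing_fix("1.4.3"))   # the version used in the original report
print(has_parsing_fix("v1.4.4"))
```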

@rvaser rvaser closed this as completed Jan 31, 2023
@msikic
Collaborator

msikic commented Jan 31, 2023 via email

@spolson
Author

spolson commented Feb 1, 2023

Thanks for the reply and sorry for the confusion on my part.

When I come to the page, the only indication that it has moved seems to be in the "About" section, and it isn't a distinct paragraph, so it's very easy to overlook (if there's some other indication, I am missing it). Repos that I have seen in similar situations (moved, but wanting to retain continuity in the original location) have put a large message at the top of the README and then placed the repo into archive mode.

In any case, thanks for the work your team has done in maintaining this tool. It is much appreciated!

[Image: Screenshot 2023-01-31 at 7 35 14 PM]
