Corrupted index makes hisat2 run forever #33

Closed
sklages opened this issue Apr 13, 2016 · 2 comments

sklages commented Apr 13, 2016

Hi,

We created a GRCh37 index (with some extra MHC haplotype sequences included). There was no "eye-catching" error, so we went on to align our datasets with hisat2 against this index.
These jobs ran "forever" on a single CPU core, even though -p 12 was used.
strace just kept dumping read(6, "", 4096) = 0 ..., so obviously nothing was happening.
So we took a closer look at the index: two of its files were missing (5+6).

  • In case of an error, you could make the error message more "eye-catching" during or after building the index. You could even remove the incomplete index to make people aware that something went seriously wrong.
  • The alignment process (or the wrapper hisat2) should probably check that the index is complete and not corrupted.
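
Until such a check exists in the tool, something along these lines could be run before starting an alignment. This is only a rough sketch, not part of hisat2: the function name `missing_index_files` is made up, and it assumes a small index made of eight files named `<prefix>.1.ht2` to `<prefix>.8.ht2` (the count and suffix are parameters in case your index differs). It only detects missing or empty files, not a truncated or otherwise damaged one.

```python
from pathlib import Path


def missing_index_files(prefix, n_parts=8, suffix="ht2"):
    """Return the names of expected index files that are absent or empty.

    Assumption: the index consists of files <prefix>.1.<suffix> up to
    <prefix>.<n_parts>.<suffix>. Adjust n_parts/suffix if yours differs.
    """
    missing = []
    for i in range(1, n_parts + 1):
        path = Path(f"{prefix}.{i}.{suffix}")
        # A zero-byte file is treated like a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(path.name)
    return missing
```

Called as `missing_index_files("/path/to/index_prefix")`, it returns an empty list when all expected files are present, so a job script can refuse to start the aligner if the list is non-empty.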

Great tool though, .. just my 2p ;-)
Sven

@infphilo
Collaborator

Thanks for the detailed info!

First I need to reproduce the problem so that I know where to put warnings or error messages. How did you get those extra MHC (or HLA) sequences?

sklages commented Apr 18, 2016

We are using Ensembl's GRCh37, ftp://ftp.ensembl.org/pub/grch37/release-84/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.dna.toplevel.fa.gz.
From it I extracted all main chromosomes plus the HSCHR6_MHC* sequences (7x).
I ran hisat2-build with --ss and --exon. Since the index was created successfully when I ran hisat2-build on a "himem machine", I assume my colleague ran out of memory on a less potent server. Nevertheless, he wasn't aware of any error occurring during the build process... so it was probably not very obvious ;-)
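
For anyone hitting the same thing, a build wrapper roughly like the one below would have made the failure hard to miss. It is a sketch under assumptions, not how hisat2-build behaves: `build_index` is a made-up helper, the eight-file `<prefix>.N.ht2` layout is assumed, and I am not certain which exit status hisat2-build returns when it runs out of memory, which is why the wrapper also checks the output files themselves.

```python
import subprocess
from pathlib import Path


def build_index(cmd, prefix, n_parts=8, suffix="ht2"):
    """Run an index build command; delete partial output if it failed.

    Assumptions: `cmd` is the full build command as a list of strings,
    and a complete index is <prefix>.1.<suffix> .. <prefix>.<n_parts>.<suffix>.
    """
    result = subprocess.run(cmd)
    expected = [Path(f"{prefix}.{i}.{suffix}") for i in range(1, n_parts + 1)]
    incomplete = [p.name for p in expected
                  if not p.is_file() or p.stat().st_size == 0]
    if result.returncode != 0 or incomplete:
        # Remove whatever was written so nobody aligns against a partial index.
        for p in expected:
            if p.exists():
                p.unlink()
        raise RuntimeError(
            f"index build failed (exit status {result.returncode}, "
            f"missing or empty: {incomplete})"
        )
    return expected
```

With placeholder file names, a call could look like `build_index(["hisat2-build", "--ss", "splicesites.txt", "--exon", "exons.txt", "genome.fa", "grch37_mhc"], "grch37_mhc")`. A process killed by a signal shows up as a negative exit status in `subprocess.run`, so that case would be caught as well.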

infphilo closed this as completed May 7, 2016