hs-blastn alignment failed #7

Closed
zihhuafang opened this issue Mar 5, 2020 · 27 comments

Comments

@zihhuafang

zihhuafang commented Mar 5, 2020

Hi,
I am running nanovar v1.3.2 (installed via bioconda) on nanopore reads from a cattle genome.

I encountered the following error when nanovar ran hs-blastn:

[05/03/2020 12:54:46] - INFO - Clustering SV breakends
[05/03/2020 12:57:52] - INFO - Filtering INS and INV SVs
[05/03/2020 12:58:47] - DEBUG - [HS-BLASTN] Loading database.
[05/03/2020 12:58:47] - DEBUG - Loading /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa.sequence, size = 3GB
[05/03/2020 12:58:54] - DEBUG - Loading /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa.bwt, size = 3GB
[05/03/2020 12:59:04] - DEBUG - Loading /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa.sa, size = 5GB
[05/03/2020 12:59:36] - CRITICAL - Error: hs-blastn alignment failed

I ran the hs-blastn command alone (see below) and did not have any issues.
hs-blastn align -db /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa \
  -window_masker_db UCD_ref.counts.obinary -query temp2.fa -out test.fa \
  -outfmt 7 -max_target_seqs 3 -gapopen 0 -gapextend 4 -penalty -3 -reward 2

I am not sure why the pipeline failed. Could you please suggest where the problem is?
Thanks!

@cytham
Owner

cytham commented Mar 5, 2020

Hi,

I am not sure. Can you try to run with --force to regenerate the blast index files? You can run it with the BAM file generated to save time.

Best,
cy

@zihhuafang
Author

Hi,
Thanks for the reply.
I ran the blast indexing step alone again and re-ran the pipeline. However, I still encountered the same error message.
I am not sure how to debug this, since the log file does not contain any details of the error.

@cytham
Owner

cytham commented Mar 5, 2020

Hi,

Did you try to run with --force?

cy

@zihhuafang
Author

Hi,
Yes, I did.
I also tried with a different working directory to start the pipeline from scratch. Still the same issue.

@cytham
Owner

cytham commented Mar 5, 2020

Hi,

Thanks for trying.

Can you send me your log file?

cy

@zihhuafang
Author

zihhuafang commented Mar 6, 2020

Hi,

Please find the attached file for the log.

NanoVar-050320-2235.log

@cytham
Owner

cytham commented Mar 6, 2020

Hi,

Thanks for the log file.

I am guessing that the loading of the blast indexes might have failed, since it got terminated at that step. You mentioned that hs-blastn ran fine alone. Can you send me the output file (e.g. test.fa) and the command line output?

Also, can you try running with this smaller test reference genome? Please decompress it before using. ref_test.fa.tar.gz

Thanks
cy

@zihhuafang
Author

zihhuafang commented Mar 6, 2020

Hi,
I attached the test.fa, but I did not keep the command line output.
I am re-running it and will post the complete command line output here.

Here's the beginning of output:

[HS-BLASTN] Loading database.
	Loading /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa.sequence, size = 3GB
	Loading /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa.bwt, size = 3GB
	Loading /nfs/nas12.ethz.ch/fs1201/green_groups_tg_public/data/fang/pacbio_snake/ref/UCD_ref.fa.sa, size = 5GB
[HS-BLASTN] done. Time elapsed: 37.94 secs.


[HS-BLASTN] Processing temp2.fa.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 8000 queries.
	Processing 3459 queries.
	Processing 3524 queries.

When hs-blastn is run alone, there does not appear to be any issue loading the indexes.

Thanks for the test file. I will try it now.

test.fa.tar.gz

@zihhuafang
Author

zihhuafang commented Mar 6, 2020

Hi,
I ran nanovar with the test reference genome, and it worked without problem.
I attached here the command line output of hs-blastn that I ran alone with my reference genome.

Thanks a lot.

hs_blastn.log

@cytham
Owner

cytham commented Mar 6, 2020

Hi,

Like you said, running the hs-blastn alone seems fine. Thanks for sending me the logs.
I am still not sure why you are getting the error when you use your own reference genome. Could it be that it is exceeding your memory limit when hs-blastn is run in the pipeline?

cy
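
One way to test the memory hypothesis raised above is to launch the same hs-blastn command from a small wrapper and look at how the process ended. The sketch below is only an illustration: `run_and_report` is a hypothetical helper (not part of NanoVar or HS-BLASTN), and it relies on the POSIX behaviour of Python's `subprocess` module, which reports a negative return code when the child was killed by a signal (for example SIGKILL from an out-of-memory killer or a job scheduler).

```python
import signal
import subprocess


def run_and_report(cmd):
    """Run cmd (a list of arguments) and describe how it ended.

    Hypothetical helper for illustration; not part of NanoVar or HS-BLASTN.
    Returns a (status, stderr_text) tuple.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        status = "ok"
    elif proc.returncode < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        status = "killed by " + signal.Signals(-proc.returncode).name
    else:
        status = "exited with code %d" % proc.returncode
    return status, proc.stderr
```

For example, `run_and_report(["hs-blastn", "align", "-db", "UCD_ref.fa", "-query", "temp2.fa", "-out", "test.fa"])` reuses flags from the command posted earlier in this thread. A status such as "killed by SIGKILL" would suggest the process was terminated from outside (often a memory limit), whereas a positive exit code together with text in stderr would suggest an error reported by the program itself.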

@zihhuafang
Author

Hi,
Thanks a lot for checking the log files.
I am not sure that the memory was the problem since I provided more memory when running nanovar than when running hs-blastn alone (I ran them as jobs on HPC). I also ran nanovar on a login node. It was not killed, and I had the same error message...

I will look into the files generated with test reference genome to figure out what caused this issue.

Thanks a lot for looking into it. I will let you know if I find something!

@cytham
Owner

cytham commented Mar 6, 2020

Hi,

ok, I will also test it on the HPC. Will let you know if I get any hints.

Thanks,
cy

@sidiropoulos

Hi,
I'm experiencing the same issue. Are there any updates on this?

Thanks,
Nikos

@cytham
Owner

cytham commented Mar 12, 2020

@sidiropoulos are you running NanoVar on the HPC too? Can you please attach your log file?

Thanks,
cy

@cytham
Owner

cytham commented Mar 17, 2020

@d2389758 any updates on this issue? I have successfully run NanoVar on my HPC.

@zihhuafang
Author

zihhuafang commented Mar 20, 2020

> @d2389758 any updates on this issue? I have successfully run NanoVar on my HPC.

I re-downloaded the reference FASTA file (without indexes) and re-ran nanovar from scratch.
It worked, so I think the problem was caused by using a FASTA file that had already been indexed.
Even re-running the index step and resuming from that step did not work.
I am not sure why.

@cytham
Owner

cytham commented Mar 20, 2020

@d2389758 I see, thanks for trying it out and I'm glad it works now. Were you running any other aligners in parallel with NanoVar using the same reference FASTA file? I am guessing that the indexes created by another aligner might have clashed with the indexes created by blast, as they might have the same suffixes. BWA is known to overwrite some of blast's indexes.
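
A quick way to look for clashing index files is to list everything that sits next to the reference FASTA and shares its file name as a prefix. The sketch below assumes that index files are named `<reference>.<suffix>`, as in the log earlier in this thread (`.sequence`, `.bwt`, `.sa`); `list_index_files` is a hypothetical helper, not a NanoVar function.

```python
from pathlib import Path


def list_index_files(fasta_path):
    """Return sibling files named '<fasta name>.<suffix>', sorted by name.

    Hypothetical helper; assumes the indexes live next to the FASTA file.
    """
    fasta = Path(fasta_path)
    prefix = fasta.name + "."
    return sorted(
        p.name
        for p in fasta.parent.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
```

Comparing the modification times of the listed files (for instance with `Path.stat().st_mtime`) may then hint at whether some of them were written by a different tool at a different time.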

Is the previous FASTA file the same as the newly downloaded FASTA file? Can you check their md5sum? Indexing of a FASTA file should not alter the file.
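
If `md5sum` is not at hand, the same comparison can be sketched in Python with the standard `hashlib` module. The function name and the paths in the usage note are placeholders chosen for illustration.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until handle.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `md5sum("old/UCD_ref.fa") == md5sum("new/UCD_ref.fa")` (hypothetical paths) is `True` only when the two files are byte-for-byte identical.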

@sidiropoulos were you able to resolve your issue?

Thanks,
cy

@zihhuafang
Author

@cytham Interestingly, my previous FASTA was indexed with minimap2. After the failed nanovar run, I re-indexed it with blast alone without any issue, but the same problem persisted.

No, the new FASTA was downloaded from Ensembl, whereas the old FASTA was downloaded from the UCSC server. I could download the same FASTA from UCSC and re-run nanovar as a test, but probably at a later time.

@cytham
Owner

cytham commented Mar 20, 2020

@d2389758 minimap2 indexing neither produces any index files nor alters the FASTA file, so I am not sure how it would affect the usage of the same FASTA for NanoVar.

@cytham
Owner

cytham commented Apr 18, 2020

I will close this for now. If you still have issues, I will open it again.

cytham closed this as completed on Apr 18, 2020

@yusmiatiliau

Hi, I tried to run nanovar using nanopore reads from a human genome, but it gave me the hs-blastn index failed error. After reading through this issue page, I tried using a new reference sequence (hg38 downloaded from UCSC, as well as hg38 chromosome 1 from Ensembl); both returned the same error.

I installed nanovar and all dependencies according to the installation guide, and this is the command line I used to call SV: nanovar -t 4 /path to fastq file/ /path to reference fasta file/ /working directory/ -x ont.

I initially tried using BAM files generated with minimap2, but got the same outcome. Can you please advise what I might have done wrong?

Thank you

@cytham
Owner

cytham commented Jun 16, 2020

@yusmiatiliau sorry that you encountered this error.

I believe this error is caused by a new version of HS-BLASTN (version 1.0.0) that is present only on GitHub. I guess you installed HS-BLASTN by cloning its GitHub repo?

Please try the following steps to fix it:

# Navigate to the v0.0.5 directory of HS-BLASTN
cd /path/to/queries/hs-blastn-src/v0.0.5/

# Compile v0.0.5
make

# Run NanoVar by specifying HS-BLASTN executable path using --hsb option
nanovar -t 4 /path to fastq file/ /path to reference fasta file/ /working directory/ -x ont --hsb /path/to/queries/hs-blastn-src/v0.0.5/hsblastn

Another way would be to install NanoVar using Conda, which will install v0.0.5 of HS-BLASTN.

Hope it works

@yusmiatiliau

Hi @cytham, thanks so much for your prompt response.

Yes, I installed hs-blastn by cloning from github.
I followed your instructions, but trying to compile v0.0.5 gave me an error:

CC sources/mask_misc.cpp => objs/./sources/mask_misc.o
In file included from ./sources/mask/include/winmask/seq_masker.hpp:41:0,
from sources/mask_misc.h:4,
from sources/mask_misc.cpp:3:
./sources/mask/include/winmask/seq_masker_istat.hpp:367:10: warning: ‘template<class> class std::auto_ptr’ is deprecated [-Wdeprecated-declarations]
std::auto_ptr< CComponentVersionInfo > fmt_version;
^~~~~~~~
In file included from /usr/include/c++/7/memory:80:0,
from ./sources/mask/mask_macros.h:63,
from ./sources/mask/include/winmask/seq_masker_window.hpp:41,
from ./sources/mask/include/winmask/seq_masker.hpp:40,
from sources/mask_misc.h:4,
from sources/mask_misc.cpp:3:
/usr/include/c++/7/bits/unique_ptr.h:51:28: note: declared here
template<typename> class auto_ptr;
^~~~~~~~
CC sources/sequence.cpp => objs/./sources/sequence.o
In file included from sources/sequence.h:5:0,
from sources/sequence.cpp:1:
sources/utility.h: In static member function ‘static int cy_utility::NString::DoubleToString(double, int, char*, int, cy_utility::TNumToStringFlags)’:
sources/utility.h:205:14: error: ‘isnan’ was not declared in this scope
else if (isnan(value))
^~~~~
sources/utility.h:205:14: note: suggested alternative:
In file included from sources/utility.h:10:0,
from sources/sequence.h:5,
from sources/sequence.cpp:1:
/usr/include/c++/7/cmath:639:5: note: ‘std::isnan’
isnan(_Tp __x)
^~~~~
Makefile:227: recipe for target 'objs/./sources/sequence.o' failed
make: *** [objs/./sources/sequence.o] Error 1
(base) cen@janice:/WORKSPACE/Cen/tools/queries/hs-blastn-src/v0.0.5$ conda deactivate
cen@janice:/WORKSPACE/Cen/tools/queries/hs-blastn-src/v0.0.5$ make
CC sources/sequence.cpp => objs/./sources/sequence.o
In file included from sources/sequence.h:5:0,
from sources/sequence.cpp:1:
sources/utility.h: In static member function ‘static int cy_utility::NString::DoubleToString(double, int, char*, int, cy_utility::TNumToStringFlags)’:
sources/utility.h:205:14: error: ‘isnan’ was not declared in this scope
else if (isnan(value))
^~~~~
sources/utility.h:205:14: note: suggested alternative:
In file included from sources/utility.h:10:0,
from sources/sequence.h:5,
from sources/sequence.cpp:1:
/usr/include/c++/7/cmath:639:5: note: ‘std::isnan’
isnan(_Tp __x)
^~~~~
Makefile:227: recipe for target 'objs/./sources/sequence.o' failed
make: *** [objs/./sources/sequence.o] Error 1

Thanks a lot!

@cytham
Owner

cytham commented Jun 16, 2020

Hi @yusmiatiliau

That seems like a bug in line 205 of ./sources/utility.h file.

Can you replace "isnan" with "std::isnan" at line 205 of ./sources/utility.h and try to run make again?
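
For anyone who prefers to script the edit rather than change the file by hand, here is a minimal sketch. It assumes the only change needed is to qualify bare `isnan(` calls as `std::isnan(`; `qualify_isnan` is a hypothetical helper, and it rewrites every bare call in the text it is given, not only line 205.

```python
import re

# Match "isnan(" only when it is not already part of a longer identifier
# or a qualified name such as "my_isnan(" or "std::isnan(".
_BARE_ISNAN = re.compile(r"(?<![\w:])isnan\s*\(")


def qualify_isnan(source):
    """Return source text with bare isnan( calls rewritten as std::isnan(."""
    return _BARE_ISNAN.sub("std::isnan(", source)
```

A possible use is to read `sources/utility.h` (the path from the compiler error), pass its contents through `qualify_isnan`, write the result back after keeping a backup copy, and then run `make` again.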

@yusmiatiliau

Hi @cytham
Thanks so much for your kind help.
It works now!

@cytham
Owner

cytham commented Jun 17, 2020

That's great!

@guypwhunt

I also get this issue when running nanovar on an HPC, but strangely only when running it with multiple threads. E.g., if I use a single thread, the error does not occur.
