Error running giraffe step #4135

Closed
yunhanajing opened this issue Oct 26, 2023 · 5 comments

@yunhanajing

yunhanajing commented Oct 26, 2023

$ vg construct -r Sus_scrofa.chr18.fa -v chr18sv.vcf.gz -a > chr18sv.vg
$ vg index chr18sv.vg -x chr18sv.xg -L
$ vg autoindex --workflow giraffe -r Sus_scrofa.chr18.fa -v chr18sv.vcf.gz -p mysv
$ vg giraffe -Z mysv.giraffe.gbz -m mysv.min -d mysv.dist -f SRR7791362_1.fastq.gz -f SRR7791362_2.fastq.gz > mysv.gam   # the step that fails

1. What were you trying to do?
I want to use vg to genotype the pig structural variants (SV).

2. What actually happened?
The earlier steps completed without any error messages, but the 'giraffe' step does not finish properly. No error message is displayed. The run is very fast at first, then it gets stuck and makes no further progress until it eventually stops. My SRR7791362_1.fastq.gz file is approximately 30GB in size, and I'm unsure whether the problem is caused by the FASTQ file being too large.

vg 1.50
@jltsiren
Contributor

How long did it take for Giraffe to run? Did it print anything? (You can also rerun Giraffe with option -p and report the output.) How large is the output mysv.gam file? If the output is non-empty, you can try running vg stats -a mysv.gam to get some basic statistics about the alignments.
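
The checks suggested above can be scripted. The following is a minimal sketch, not part of vg: `report_output` is a hypothetical helper name, the `vg` invocations in the comments repeat the commands from this thread, and sending the `-p` progress output to a log file via `2>` assumes it is written to standard error.

```shell
# Hypothetical helper: print the size in bytes of an output file,
# or fail if the file is missing or empty.
report_output() {
    if [ ! -s "$1" ]; then
        echo "missing or empty: $1" >&2
        return 1
    fi
    wc -c < "$1" | tr -d ' '
}

# Intended use (commands taken from this thread; not run here):
#   vg giraffe -p -Z mysv.giraffe.gbz -m mysv.min -d mysv.dist \
#       -f SRR7791362_1.fastq.gz -f SRR7791362_2.fastq.gz > mysv.gam 2> giraffe.log
#   report_output mysv.gam && vg stats -a mysv.gam
```

Running `vg stats` only when the file is non-empty avoids confusing a missing output with a parsing failure.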

Also, it looks like you are trying to align whole-genome reads to a single-chromosome graph. The result will not be particularly sensible. Many reads from other chromosomes will also align to that chromosome, but with more errors or with a mapping quality estimate that is too high.

@yunhanajing
Author

I used 6 threads. The timestamps are Thu Oct 26 12:44:00 and Thu Oct 26 12:52:40, and it ran for 32 minutes. Shortly afterwards, the job status was "C". The file mysv.gam is 3.1GB.

I also tried the following commands, but encountered the same issue:

$ vg construct -r Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -v allsv.vcf.gz -a > allsv.vg
$ vg index allsv.vg -x allsv.xg -L
$ vg autoindex --workflow giraffe -r Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -v allsv.vcf.gz -p mysv
$ vg giraffe -Z mysv.giraffe.gbz -m mysv.min -d mysv.dist -f SRR7791362_1.fastq.gz -f SRR7791362_2.fastq.gz > mysv.gam   # the step that fails

During the initial testing phase, I used only chr18. When I extracted and compressed a small FASTQ subset (only 7M), there was no error.
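
A small test subset like the one described here can be produced with standard tools. This is a sketch under assumptions: `fastq_head` is a hypothetical helper, the input is assumed to be a gzip-compressed FASTQ with exactly four lines per read, and for paired-end data the same read count has to be taken from both files so the pairs stay in sync.

```shell
# Hypothetical helper: write the first N reads of a gzipped FASTQ
# to a new gzipped FASTQ. Assumes 4 lines per read.
fastq_head() {
    in_fq="$1"; n_reads="$2"; out_fq="$3"
    gzip -cd "$in_fq" | head -n "$((n_reads * 4))" | gzip > "$out_fq"
}

# Example (input names from this thread; the subset names are made up):
#   fastq_head SRR7791362_1.fastq.gz 1000000 subset_1.fastq.gz
#   fastq_head SRR7791362_2.fastq.gz 1000000 subset_2.fastq.gz
```

Increasing the read count step by step is one way to find out whether the failure depends on input size or on elapsed time.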

@jltsiren
Contributor

So you get a GAM file that is smaller than expected. What does vg stats -a mysv.gam output?

What does Giraffe print when you map the reads with option -p?

@yunhanajing
Author

After adding the -p option, the GAM file grew to 25GB, but the job still terminated prematurely. I submitted the job with qsub, but neither giraffe.sh.e593742 nor giraffe.sh.o593742 was generated. When I ran vg stats -a mysv.gam, I got the following error:
[E::bgzf_read_block] Failed to read BGZF block data at offset 26799640687 expected 16999 bytes; hread returned 12159
terminate called after throwing an instance of 'std::runtime_error'
what(): [vg::io::MessageIterator] obsolete, invalid, or corrupt input at message 1756340176354273 group 1756333797839874
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.50.1 "Monopoli"
Stack trace (most recent call last) in thread 25400:
#14 Object "", at 0xffffffffffffffff, in
#13 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x21472f3, in __clone
#12 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x20a09ba, in start_thread
#11 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x204329d, in gomp_thread_start
#10 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0xded18b, in void vg::io::for_each_parallel_impl<vg::Alignment>(std::istream&, std::function<void (vg::Alignment&, vg::Alignment&)> const&, std::function<void (vg::Alignment&)> const&, std::function<bool ()> const&, unsigned long) [clone ._omp_fn.0]
#9 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x5d47ef, in vg::io::MessageIterator::take[abi:cxx11]() [clone .cold]
#8 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x205a49d, in _Unwind_Resume
#7 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x205991b, in _Unwind_RaiseException_Phase2
#6 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x1f960d9, in __gxx_personality_v0
#5 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x20305e8, in __cxa_call_terminate
#4 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x1f9698b, in __cxxabiv1::__terminate(void (*)())
#3 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x5e29e3, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#2 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x5e512b, in abort
#1 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x20757d5, in raise
#0 Object "/Lustre01/anaconda3/envs/vg/bin/vg", at 0x20a21dc, in __pthread_kill
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
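
The `bgzf_read_block` message reports a short read (12159 of 16999 expected bytes) at offset 26799640687, which is about 24.96 GiB and therefore close to the end of the roughly 25GB file. That pattern is consistent with output that was cut off while it was being written. A minimal sketch for checking this, assuming the GAM file is BGZF-compressed and therefore readable by `gzip -t` (the function names are hypothetical):

```shell
# Convert a byte count to GiB with two decimals.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Succeed only if the file decompresses cleanly to its end.
# A file truncated in the middle of a block fails this test.
is_intact_gzip() {
    gzip -t "$1" 2>/dev/null
}

# Example (not run here):
#   bytes_to_gib 26799640687      # offset from the error message
#   is_intact_gzip mysv.gam || echo "mysv.gam looks truncated"
```

If the integrity check fails, the crash in `vg stats` is a symptom of the incomplete file rather than a separate bug.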

@jltsiren
Contributor

This sounds like an issue with the environment you are running Giraffe in. Maybe the process exceeds some memory or running time limit you have configured, and the environment terminates it early.
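
One way to test this hypothesis is to have the job script record its own limits and exit status, since no scheduler output files were produced. The sketch below is generic shell, not a vg or scheduler feature. `describe_exit` and `run_logged` are hypothetical helpers, and the sketch relies on the common shell convention that an exit status above 128 means the process was killed by a signal (status minus 128).

```shell
# Translate a shell exit status into a short description.
describe_exit() {
    if [ "$1" -eq 0 ]; then
        echo "exited normally"
    elif [ "$1" -gt 128 ]; then
        echo "killed by signal $(( $1 - 128 ))"
    else
        echo "exited with status $1"
    fi
}

# Run a command with stderr appended to a log file, then log how it ended.
run_logged() {
    log="$1"; shift
    ulimit -a >> "$log" 2>&1          # record the limits in effect
    status=0
    "$@" 2>> "$log" || status=$?      # keep the status even on failure
    describe_exit "$status" >> "$log"
    return "$status"
}

# Example (not run here; stdout is still redirected by the caller):
#   run_logged giraffe.log vg giraffe -p -Z mysv.giraffe.gbz -m mysv.min \
#       -d mysv.dist -f SRR7791362_1.fastq.gz -f SRR7791362_2.fastq.gz > mysv.gam
```

A log ending in "killed by signal 9" would point to the process being terminated from outside, for example by a memory or wall-time limit.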
