VG RNA issue #2828

Open
RenzoTale88 opened this issue Jun 1, 2020 · 23 comments · Fixed by #2852

Comments

@RenzoTale88

Good morning,
I'm trying to add an annotation to a cactus graph converted through hal2vg using vg rna.
I've downloaded the annotation for one of the two genomes used to generate the graph, renamed the chromosomes to genome.chrN to stay consistent with the cactus naming, extracted the transcripts from the file, and tried to add them with the following command:

vg rna -e -n annotation.transcripts.gff mygraph.vg > mygraph.annot.vg

However, when I run it, I get the following error:

Crash report for vg v1.24.0 "Montieri"
Stack trace (most recent call last):
#14   Object "/vg/bin/vg", at 0x4ddbb9, in _start
#13   Object "/vg/bin/vg", at 0x1c39fc8, in __libc_start_main
#12   Object "/vg/bin/vg", at 0x40a627, in main
#11   Object "/vg/bin/vg", at 0xa1a067, in vg::subcommand::Subcommand::operator()(int, char**) const
#10   Object "/vg/bin/vg", at 0xa152fd, in main_rna(int, char**)
#9    Object "/vg/bin/vg", at 0xaab94c, in vg::get_input_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void (std::istream&)>)
#8    Object "/vg/bin/vg", at 0xa13a53, in std::_Function_handler<void (std::istream&), main_rna(int, char**)::{lambda(std::istream&)#2}>::_M_invoke(std::_Any_data const&, std::istream&)
#7    Object "/vg/bin/vg", at 0xcb9ae3, in vg::Transcriptome::add_transcript_splice_junctions(std::istream&, gbwt::GBWT*)
#6    Object "/vg/bin/vg", at 0xcb8fce, in vg::Transcriptome::parse_transcripts(std::istream&) const
#5    Object "/vg/bin/vg", at 0xcb81e8, in vg::Transcriptome::add_exon(vg::Transcript*, std::pair<int, int> const&, vg::PathIndex const&) const [clone .constprop.1505]
#4    Object "/vg/bin/vg", at 0xb8369a, in vg::PathIndex::find_position(unsigned long) const
#3    Object "/vg/bin/vg", at 0x1c3dee1, in __assert_fail
#2    Object "/vg/bin/vg", at 0x1c3de6b, in __assert_fail_base
#1    Object "/vg/bin/vg", at 0x1c4a890, in abort
#0    Object "/vg/bin/vg", at 0x11fc257, in raise

Am I doing something wrong with that?
Thank you
Andrea
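
The renaming step described in this comment (prefixing each chromosome name so it matches the genome.chrN path names in the cactus graph) could be sketched roughly as follows. This is only an illustration: the function name `prefix_seqids` is hypothetical, and it assumes a tab-separated GFF whose first column holds the sequence name. Directive lines such as `##sequence-region` are passed through unchanged, so they would need separate handling.

```python
def prefix_seqids(gff_lines, genome):
    """Yield GFF lines with column 1 rewritten as '<genome>.<seqid>'.

    Hypothetical helper: assumes tab-separated GFF records.
    """
    for line in gff_lines:
        # Leave comments, directives and blank lines untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        fields[0] = f"{genome}.{fields[0]}"
        yield "\t".join(fields) + "\n"
```

Running it over an open annotation file and writing the yielded lines to a new file gives an annotation whose sequence names can be compared directly with the path names in the graph.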

@jonassibbesen
Contributor

Hi Andrea,

It seems to assert in the path index when trying to find the node position of an exon boundary on a chromosome path. My best guess is that the position is higher than the chromosome path length. Would it be possible for you to share the error message that vg rna gave when crashing? That would help me figure out which assert specifically it is crashing on. You can get the path lengths using vg paths -E -v mygraph.vg, if you want to confirm that the exon boundaries are within the path. If that is not the issue, would it then be possible for you to share your graph and annotation?

Thanks,

Jonas
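
The boundary check suggested above can be sketched as a small script. It assumes that the output of vg paths -E -v mygraph.vg has been saved as two tab-separated columns (path name, path length) and that the annotation is a tab-separated GFF with 1-based, inclusive coordinates; both function names are hypothetical.

```python
def parse_path_lengths(lines):
    """Read 'name<TAB>length' lines into a dict (assumed output layout)."""
    lengths = {}
    for line in lines:
        if not line.strip():
            continue
        name, length = line.rstrip("\n").split("\t")[:2]
        lengths[name] = int(length)
    return lengths


def out_of_range_features(gff_lines, lengths, feature="exon"):
    """Return features that name an unknown path or extend past its end."""
    problems = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[2] != feature:
            continue
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        if seqid not in lengths:
            problems.append((seqid, start, end, "unknown path"))
        elif start < 1 or end > lengths[seqid]:
            problems.append((seqid, start, end, "outside path"))
    return problems
```

An empty result means every exon lies on a known path and within its length, which rules out this particular explanation for the assertion.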

@RenzoTale88
Author

RenzoTale88 commented Jun 2, 2020

Hi @jonassibbesen
So, trying with the following command:

vg rna -p -y exon -e -n annotation.exons.gff mygraph.tmp2.vg > mygraph.final.vg

I get the following error:

[vg rna] Parsing graph file ...
[vg rna] Graph parsed in 18.022 seconds, 3.21414 GB
[vg rna] Adding novel exon boundaries and splice-junctions to graph ...
vg: src/path_index.cpp:367: vg::PathIndex::iterator vg::PathIndex::find_position(size_t) const: Assertion `position - starts_next->first < node_length(starts_next)' failed.
ERROR: Signal 6 occurred. VG has crashed. Run 'vg bugs --new' to report a bug.
Stack trace path: /tmp/vg_crash_iVnvw0/stacktrace.txt
Please include the stack trace file in your bug report!

With this stack trace:

Crash report for vg v1.24.0 "Montieri"
Stack trace (most recent call last):
#14   Object "/vg/bin/vg", at 0x4ddbb9, in _start
#13   Object "/vg/bin/vg", at 0x1c39fc8, in __libc_start_main
#12   Object "/vg/bin/vg", at 0x40a627, in main
#11   Object "/vg/bin/vg", at 0xa1a067, in vg::subcommand::Subcommand::operator()(int, char**) const
#10   Object "/vg/bin/vg", at 0xa152fd, in main_rna(int, char**)
#9    Object "/vg/bin/vg", at 0xaab94c, in vg::get_input_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void (std::istream&)>)
#8    Object "/vg/bin/vg", at 0xa13a53, in std::_Function_handler<void (std::istream&), main_rna(int, char**)::{lambda(std::istream&)#2}>::_M_invoke(std::_Any_data const&, std::istream&)
#7    Object "/vg/bin/vg", at 0xcb9ae3, in vg::Transcriptome::add_transcript_splice_junctions(std::istream&, gbwt::GBWT*)
#6    Object "/vg/bin/vg", at 0xcb8fce, in vg::Transcriptome::parse_transcripts(std::istream&) const
#5    Object "/vg/bin/vg", at 0xcb81e8, in vg::Transcriptome::add_exon(vg::Transcript*, std::pair<int, int> const&, vg::PathIndex const&) const [clone .constprop.1505]
#4    Object "/vg/bin/vg", at 0xb8369a, in vg::PathIndex::find_position(unsigned long) const
#3    Object "/vg/bin/vg", at 0x1c3dee1, in __assert_fail
#2    Object "/vg/bin/vg", at 0x1c3de6b, in __assert_fail_base
#1    Object "/vg/bin/vg", at 0x1c4a890, in abort
#0    Object "/vg/bin/vg", at 0x11fc257, in raise

Does this help?
I'm also trying to regenerate the cactus alignments and the vg graph, since the genomes are relatively small.
Thanks for the quick reply
Andrea

EDIT: I've tried regenerating the cactus alignments, converting them to a graph through hal2vg, and annotating it with vg rna. The issue persists. I've also checked the lengths of the paths, and it seems that no exons fall outside the path lengths. I'm working on a way to share the data with you.
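
For readers unfamiliar with the failing assertion, the kind of lookup it guards can be illustrated with a toy path index. This is not vg's implementation: `ToyPathIndex` is a hypothetical class that stores the start offset of each node along a path, finds the last node starting at or before the queried position, and then checks that the remaining offset is smaller than that node's length, which is the same shape of condition as in the assertion message.

```python
from bisect import bisect_right


class ToyPathIndex:
    """Toy position lookup over a path given as (node_id, length) pairs."""

    def __init__(self, node_lengths):
        self.starts = []  # start offset of each node visit, ascending
        self.nodes = []   # (node_id, length) for each visit
        offset = 0
        for node_id, length in node_lengths:
            self.starts.append(offset)
            self.nodes.append((node_id, length))
            offset += length

    def find_position(self, position):
        """Return (node_id, offset_in_node) for a 0-based path position."""
        # Index of the last node whose start is <= position.
        i = bisect_right(self.starts, position) - 1
        if i < 0:
            raise IndexError("position precedes the path")
        node_id, length = self.nodes[i]
        offset = position - self.starts[i]
        # Analogue of the failing check: the offset must fall inside the node.
        if offset >= length:
            raise IndexError("position is past the end of the path")
        return node_id, offset
```

In this toy version the check only fails for positions beyond the path end, which is why a path-length problem was the first suspect. The thread goes on to show that the real cause was elsewhere.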

@jonassibbesen
Contributor

Thank you, Andrea. It was the assert I expected. However, given that it is not the path length, I am not sure why this is happening. It would therefore be great if you could share the data.

Best,

Jonas

@jonassibbesen
Contributor

I just thought of another thing that could also result in the same issue. To test that I would only need the annotation. Would it be possible to share that?

@RenzoTale88
Author

Sure, what email address can I use to share the file?

@jonassibbesen
Contributor

jsibbese@ucsc.edu

@RenzoTale88
Author

@jonassibbesen I've just sent the email with the link to the file of interest. When you get a chance to look at it, let me know if you have any issues accessing the file.
Thank you again for the help

Andrea

@jonassibbesen
Contributor

Thank you for reminding me. I have found the issue: it is a problem with how vg rna deals with variants around the annotation boundaries, which does not work properly when the annotation starts on the first position of a contig. I will push a fix later today. Sorry for it taking so long.
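
Anyone hitting the same assertion on an older build can check whether their annotation contains the triggering case with a short filter. The sketch below assumes a tab-separated GFF with 1-based start coordinates; `features_at_contig_start` is a hypothetical helper, not part of vg.

```python
def features_at_contig_start(gff_lines):
    """Return (seqid, type, start, end) for features starting at position 1."""
    hits = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # GFF coordinates are 1-based, so start == 1 is the first base.
        if len(fields) >= 5 and int(fields[3]) == 1:
            hits.append((fields[0], fields[2], 1, int(fields[4])))
    return hits
```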

@jonassibbesen
Contributor

Hi, I have merged a fix that should hopefully resolve the issue. Let me know if you are still having problems.

@RenzoTale88
Author

Hi
I've just tried with VG version 1.25.0, and I still get some errors:

Crash report for vg v1.25.0 "Apice"
Stack trace (most recent call last) in thread 134762:
#11   Object "", at 0xffffffffffffffff, in
#10   Object "/vg/bin/vg", at 0x1d0e45e, in __clone
#9    Object "/vg/bin/vg", at 0x12349da, in start_thread
#8    Object "/vg/bin/vg", at 0x1c1008e, in execute_native_thread_routine
#7    Object "/vg/bin/vg", at 0xb5b896, in vg::Transcriptome::construct_edited_transcript_paths_callback(std::__cxx11::list<vg::EditedTranscriptPath, std::allocator<vg::EditedTranscriptPath> >*, std::mutex*, int, std::vector<vg::Transcript, std::allocator<vg::Transcript> > const&) const
#6    Object "/vg/bin/vg", at 0xb5a843, in vg::Transcriptome::project_transcript_embedded[abi:cxx11](vg::Transcript const&, bool) const
#5    Object "/vg/bin/vg", at 0x15e1b84, in bdsg::HashGraph::for_each_step_on_handle_impl(handlegraph::handle_t const&, std::function<bool (handlegraph::step_handle_t const&)> const&) const
#4    Object "/vg/bin/vg", at 0xb5810a, in std::_Function_handler<bool (handlegraph::step_handle_t const&), handlegraph::BoolReturningWrapper<vg::Transcriptome::project_transcript_embedded(vg::Transcript const&, bool) const::{lambda(handlegraph::step_handle_t const&)#1}, handlegraph::step_handle_t, void>::wrap({lambda(handlegraph::step_handle_t const&)#1} const&)::{lambda(handlegraph::step_handle_t const&)#1}>::_M_invoke(std::_Any_data const&, handlegraph::step_handle_t const&)
#3    Object "/vg/bin/vg", at 0x1c7dba1, in __assert_fail
#2    Object "/vg/bin/vg", at 0x1c7db2b, in __assert_fail_base
#1    Object "/vg/bin/vg", at 0x1c8a550, in abort
#0    Object "/vg/bin/vg", at 0x1239c57, in raise

Thanks again for your help
Andrea

@jonassibbesen
Contributor

Hi Andrea, I am not sure why this is happening. Would it be possible for you to share the data you used?

jonassibbesen reopened this Jul 3, 2020

@RenzoTale88
Author

Hi @jonassibbesen, sorry for the late reply. Sure, I'll arrange to share the graph I'm using and the annotation. However, it might take me a few days. Is that all right with you?
Thanks again for the support anyway!

@RenzoTale88
Author

Hi @jonassibbesen sorry for my delay. I got the files ready to share, is it ok to send them through email?

@jonassibbesen
Contributor

No worries. Yes, that works.

@RenzoTale88
Author

@jonassibbesen let me know if you receive the link and if you can access the files.

@jonassibbesen
Contributor

@RenzoTale88 Thanks, got the annotation. Would it be possible for you to also share the graph used?

@RenzoTale88
Author

@jonassibbesen I've just shared a link to download the files to reproduce the error. Let me know if you receive it.

@jonassibbesen
Contributor

I received it. Thanks!

@jonassibbesen
Contributor

@RenzoTale88 I found the issue and it is due to vg rna not correctly handling exons that either start or end in a cycle in the graph. I am working on a fix now, but it might be a couple of days before it is finished.
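
One way to picture why cycles are awkward here: when a path traverses a cycle it visits the same node more than once, so a node no longer corresponds to a single position on the path. The toy sketch below only illustrates that ambiguity and says nothing about how vg resolves it; the path representation and both function names are assumptions made for the example.

```python
from collections import defaultdict


def node_occurrences(path):
    """Map each node id to every path offset at which it is visited.

    `path` is a list of (node_id, length) pairs in traversal order.
    """
    occurrences = defaultdict(list)
    offset = 0
    for node_id, length in path:
        occurrences[node_id].append(offset)
        offset += length
    return dict(occurrences)


def nodes_in_cycles(path):
    """Return only the nodes that the path visits more than once."""
    return {
        node_id: offsets
        for node_id, offsets in node_occurrences(path).items()
        if len(offsets) > 1
    }


# Node 2 is traversed twice, so it maps to two different path offsets.
example_path = [(1, 4), (2, 3), (3, 2), (2, 3), (4, 5)]
print(nodes_in_cycles(example_path))  # {2: [4, 9]}
```

An exon boundary that falls on such a node matches more than one path position, so any code that assumes one position per node has to be extended to choose between them.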

@RenzoTale88
Author

@jonassibbesen thanks so much, yes that's fine! Just post here when I can start building vg.

Andrea

@jonassibbesen
Contributor

jonassibbesen commented Sep 15, 2020

Quick update. Still working on this. I am running into some issues when projecting onto cycles, but I think I am close to solving it. I have it working for adding transcripts and splice-junctions to the graph, but not for projecting onto other embedded paths in the graph. If you are not interested in the latter, I can push a PR later today that you can use.

@jonassibbesen
Contributor

Just merged a PR that adds support for cycles on exon borders. It now runs to completion on your graph. Let me know if you run into any other problems.

@RenzoTale88
Author

@jonassibbesen thanks so much, I'll try it as soon as I can!
