SDSL assertion error when working with graph from vg mod -M 8 #2681
@adamnovak this looks like it's related to the move to HashGraph and default serialization type? |
Oh the problem here is that we moved to a new output format for most commands (HashGraph), but since it stores everything in a hash table it has no concept of an ordering, so sorting it doesn't make sense. I think the solution here is to drop the Did this pipeline come from a wiki page that we maybe need to update? |
Oh, I wasn't aware of that, I should pay more attention to release notes! Thank you for your quick response. I think I got the pipeline from the CPANG19 course @ekg gave last year. |
I have another problem with the same pipeline, should I open a new issue for that? Neither
|
Hm. That shouldn't happen, as far as I know. I can't think of a way that It would be useful to have the stack trace files for both commands failing, and also probably the broken graph, and the original graph. Can anything load that broken graph? You could try I do manage to get an error out of the second
|
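The graph doesn't seem to be broken on first glance: `vg stats -z graph_mod.vg` reports 1265264 nodes and 1644925 edges.
Running `vg index -x graph_mod.xg graph_mod.vg` fails with the SDSL assertion `i > 0 and i <= m_arg_cnt` and produces a stack trace file.
The error message from `vg prune -r graph_mod.vg > graph_prune.vg` is the same as before; the stack trace file is only slightly different.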
|
It looks like prune insists on building an XG internally (which it
shouldn't need to do anymore, but we never bothered to change), and
there's something wrong with our xg-building code that causes it to
fail on this graph. Maybe it is managing to process graph elements in
the wrong order.
…On 3/25/20, Sarah Pohl ***@***.***> wrote:
The graph doesn't seem to be broken on first glance: `vg stats -z
graph_mod.vg` gives me the following output:
```
nodes 1265264
edges 1644925
```
Running `vg index -x graph_mod.xg graph_mod.vg` results in this error:
```
vg: /data3/genome_graphs/vg-v1.22.0/include/sdsl/select_support_mcl.hpp:349:
sdsl::select_support::size_type sdsl::select_support_mcl<t_bit_pattern,
t_pattern_len>::select(sdsl::select_support::size_type) const [with unsigned
char t_b = 1u; unsigned char t_pat_len = 1u; sdsl::select_support::size_type
= long unsigned int]: Assertion `i > 0 and i <= m_arg_cnt' failed.
ERROR: Signal 6 occurred. VG has crashed. Run 'vg bugs --new' to report a
bug.
Stack trace path: /tmp/vg_crash_jtjaMZ/stacktrace.txt
Please include the stack trace file in your bug report!
```
and this stack trace file:
```
Crash report for vg v1.22.0 "Rotella"
Stack trace (most recent call last):
#12 Object "", at 0xffffffffffffffff, in
#11 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x7f1798, in
_start
#10 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7f93a16bb82f, in
__libc_start_main
#9 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x735074, in main
Source "src/main.cpp", line 75, in main [0x735074]
#8 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0xcb86a7, in
vg::subcommand::Subcommand::operator()(int, char**) const
| Source "src/subcommand/subcommand.cpp", line 72, in operator()
Source "/usr/include/c++/5/functional", line 2267, in operator()
[0xcb86a7]
2264: {
2265: if (_M_empty())
2266: __throw_bad_function_call();
>2267: return _M_invoker(_M_functor,
std::forward<_ArgTypes>(__args)...);
2268: }
2269:
2270: #if __cpp_rtti
#7 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0xc1d9de, in
main_index(int, char**)
Source "src/subcommand/index_main.cpp", line 551, in main_index
[0xc1d9de]
#6 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0xfd10b0, in
vg::VGset::to_xg(xg::XG&, std::function<bool
(std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&)> const&,
std::map<std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >, vg::Path, std::less<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > >,
std::allocator<std::pair<std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const, vg::Path> > >*)
Source "src/vg_set.cpp", line 212, in to_xg [0xfd10b0]
#5 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x16aa898, in
xg::XG::from_enumerators(std::function<void (std::function<void
(std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, long const&)> const&)> const&,
std::function<void (std::function<void (long const&, bool const&, long
const&, bool const&)> const&)> const&, std::function<void
(std::function<void (std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&, long const&, bool
const&, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, bool const&, bool const&)> const&)> const&,
bool, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >)
| Source "deps/xg/src/xg.cpp", line 882, in operator()
| Source "deps/xg/src/xg.cpp", line 834, in operator()
Source
"/data3/genome_graphs/vg-v1.22.0/include/sdsl/select_support_mcl.hpp", line
398, in from_enumerators [0x16aa898]
395: template<uint8_t t_b, uint8_t t_pat_len>
396: inline auto
select_support_mcl<t_b,t_pat_len>::operator()(size_type i)const ->
size_type
397: {
> 398: return select(i);
399: }
400:
401: template<uint8_t t_b, uint8_t t_pat_len>
#4 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x833552, in
sdsl::select_support_mcl<(unsigned char)1, (unsigned
char)1>::select(unsigned long) const
Source
"/data3/genome_graphs/vg-v1.22.0/include/sdsl/select_support_mcl.hpp", line
349, in select [0x833552]
346: template<uint8_t t_b, uint8_t t_pat_len>
347: inline auto select_support_mcl<t_b,t_pat_len>::select(size_type
i)const -> size_type
348: {
> 349: assert(i > 0 and i <= m_arg_cnt);
350:
351: i = i-1;
352: size_type sb_idx = i>>12; // i/4096
#3 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7f93a16c8c81, in
__assert_fail
#2 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7f93a16c8bd6, in
#1 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7f93a16d2029, in
abort
#0 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7f93a16d0428, in
raise
```
The error message from `vg prune -r graph_mod.vg > graph_prune.vg` is the
same as before, the stack trace file is only slightly different:
```
Crash report for vg v1.22.0 "Rotella"
Stack trace (most recent call last):
#12 Object "", at 0xffffffffffffffff, in
#11 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x7f1798, in
_start
#10 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7ff95080d82f, in
__libc_start_main
#9 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x735074, in main
Source "src/main.cpp", line 75, in main [0x735074]
#8 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0xcb86a7, in
vg::subcommand::Subcommand::operator()(int, char**) const
| Source "src/subcommand/subcommand.cpp", line 72, in operator()
Source "/usr/include/c++/5/functional", line 2267, in operator()
[0xcb86a7]
2264: {
2265: if (_M_empty())
2266: __throw_bad_function_call();
>2267: return _M_invoker(_M_functor,
std::forward<_ArgTypes>(__args)...);
2268: }
2269:
2270: #if __cpp_rtti
#7 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0xc9c260, in
main_prune(int, char**)
Source "src/subcommand/prune_main.cpp", line 370, in main_prune
[0xc9c260]
#6 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x16acff1, in
xg::XG::from_path_handle_graph(handlegraph::PathHandleGraph const&)
Source "deps/xg/src/xg.cpp", line 729, in from_path_handle_graph
[0x16acff1]
#5 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x16aa898, in
xg::XG::from_enumerators(std::function<void (std::function<void
(std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, long const&)> const&)> const&,
std::function<void (std::function<void (long const&, bool const&, long
const&, bool const&)> const&)> const&, std::function<void
(std::function<void (std::__cxx11::basic_string<char,
std::char_traits<char>, std::allocator<char> > const&, long const&, bool
const&, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> > const&, bool const&, bool const&)> const&)> const&,
bool, std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >)
| Source "deps/xg/src/xg.cpp", line 882, in operator()
| Source "deps/xg/src/xg.cpp", line 834, in operator()
Source
"/data3/genome_graphs/vg-v1.22.0/include/sdsl/select_support_mcl.hpp", line
398, in from_enumerators [0x16aa898]
395: template<uint8_t t_b, uint8_t t_pat_len>
396: inline auto
select_support_mcl<t_b,t_pat_len>::operator()(size_type i)const ->
size_type
397: {
> 398: return select(i);
399: }
400:
401: template<uint8_t t_b, uint8_t t_pat_len>
#4 Object "/data3/genome_graphs/vg-v1.22.0/bin/vg", at 0x833552, in
sdsl::select_support_mcl<(unsigned char)1, (unsigned
char)1>::select(unsigned long) const
Source
"/data3/genome_graphs/vg-v1.22.0/include/sdsl/select_support_mcl.hpp", line
349, in select [0x833552]
346: template<uint8_t t_b, uint8_t t_pat_len>
347: inline auto select_support_mcl<t_b,t_pat_len>::select(size_type
i)const -> size_type
348: {
> 349: assert(i > 0 and i <= m_arg_cnt);
350:
351: i = i-1;
352: size_type sb_idx = i>>12; // i/4096
#3 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7ff95081ac81, in
__assert_fail
#2 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7ff95081abd6, in
#1 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7ff950824029, in
abort
#0 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7ff950822428, in
raise
```
|
Is there anything I can do? Without pruning I'm running out of disk space. The graph was generated with minimap2 and seqwish, and I converted it to vg with |
It definitely needs to be pruned... I'm not sure there's a workaround, besides cutting the xg generation out of Is there a way you can share the graph? It would help a lot to be able to figure out exactly what it is doing to upset the xg builder. |
Yes, sure, I put the graph in my Google Drive. I also have online protocols for the graph creation and my tries to map reads to it. |
OK, I'm able to replicate the issue. The immediate problem here is that the graph being fed to xg generation contains empty nodes (nodes with "" for a sequence). These are generally not supposed to be there, and the XG format AFAIK can't actually represent them. It looks like they are being created because vg/src/algorithms/remove_high_degree.cpp Lines 8 to 25 in d95bd06
and the semantics of This allows us to end up with a A workaround is to remove all the paths from the graph before using
However, obviously then there are no paths at all, even if not all paths would have been broken by the removal of the high-degree nodes. Another workaround would be to skip the removal of the high-degree nodes altogether, and to just hope that the pruning step sufficiently reduces the complexity of the graph. |
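As a rough illustration of why an empty node can trip a select-style assertion like the one in the stack traces, here is a toy sketch. It is not the actual XG or SDSL code: the layout (one set bit per node start in the concatenated sequence) and the names `build_start_bits` and `select` are assumptions made purely for illustration.

```python
def build_start_bits(sequences):
    """Mark the start offset of each node in the concatenated sequence.

    Toy layout (an assumption, not the real XG format): one bit per base,
    set to 1 where a node begins.
    """
    total = sum(len(seq) for seq in sequences)
    bits = [0] * total
    offset = 0
    for seq in sequences:
        if offset < total:
            # An empty node shares its offset with the next node,
            # so it cannot get a set bit of its own.
            bits[offset] = 1
        offset += len(seq)
    return bits


def select(bits, i):
    """Return the position of the i-th set bit (1-based)."""
    arg_cnt = sum(bits)
    if not (0 < i <= arg_cnt):
        # Mirrors the shape of the failing check in the crash report.
        raise AssertionError("i > 0 and i <= m_arg_cnt")
    seen = 0
    for pos, bit in enumerate(bits):
        seen += bit
        if bit and seen == i:
            return pos
    raise AssertionError("unreachable")


good = build_start_bits(["GAT", "TACA", "C"])
print(select(good, 3))  # start offset of the third node

bad = build_start_bits(["GAT", "", "TACA"])
try:
    select(bad, 3)  # three nodes, but only two set bits
except AssertionError as err:
    print("assertion failed:", err)
```

In this toy model the graph with an empty node has three nodes but only two marked start positions, so asking for the third start falls outside the valid range.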
Thank you! Leaving the removal of high degree nodes out has worked, I now have the xg and the gcsa index. |
Just a quick comment on the avoid-xg workaround up in #2681 (comment): Even though we can represent empty nodes in the non-xg formats, I'm pretty sure there are several places in the code where they will lead to undefined results. Definitely want to nip these in the bud, so the new error message in xg is very helpful, and should probably be ported across implementations. |
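A minimal sketch of the kind of up-front check being suggested here, assuming the graph is available as a plain mapping from node ID to sequence. The function name `check_no_empty_nodes` and the error wording are hypothetical and are not taken from vg or xg.

```python
def check_no_empty_nodes(node_sequences):
    """Raise a descriptive error if any node has an empty sequence.

    node_sequences: dict mapping node ID -> sequence string
    (a stand-in for a real graph; this is an assumption).
    """
    empty = sorted(node_id for node_id, seq in node_sequences.items() if not seq)
    if empty:
        raise ValueError(
            f"graph contains {len(empty)} empty node(s); first is node {empty[0]}"
        )


check_no_empty_nodes({1: "GAT", 2: "TACA"})  # passes silently

try:
    check_no_empty_nodes({1: "GAT", 2: "", 3: ""})
except ValueError as err:
    print(err)
```

Failing early with the offending node ID is easier to act on than an assertion deep inside index construction.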
@adamnovak I think removing a node on a path is documented as undefined behavior in libhandlegraph, since it's not really clear what the right thing to do with the path would be. I think the algorithm should probably be made path-aware if we want to keep it. |
It looks like actually prohibiting empty nodes in all our handle graph implementations breaks |
I think the only sensible thing to do is to destroy the whole path
auto-magically; I'm trying to implement that right now. I don't like
leaving it as undefined behavior, because then to recover from it the
user is going to have to destroy the paths anyway, and in cases like
this algorithm it doesn't even know paths exist.
|
And IMHO it shouldn't need to know that paths exist. Otherwise you
can't safely use a plain DeletableHandleGraph for anything. Nobody wants
a type with operations that are just sometimes undefined behavior and
you can't know when.
|
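The "destroy the whole path" semantics discussed above can be sketched with a toy graph. This is only an illustration under assumed names (`ToyPathGraph`, `destroy_node`); it is not the libhandlegraph interface and does not reflect how any vg implementation stores paths.

```python
class ToyPathGraph:
    """Tiny in-memory graph with embedded paths (illustrative only)."""

    def __init__(self):
        self.nodes = {}     # node ID -> sequence
        self.edges = set()  # (from ID, to ID)
        self.paths = {}     # path name -> list of node IDs

    def destroy_node(self, node_id):
        """Remove a node, its edges, and every path that visits it.

        Returns the names of the paths that were destroyed, so a caller
        that does care about paths can find out what happened.
        """
        del self.nodes[node_id]
        self.edges = {edge for edge in self.edges if node_id not in edge}
        broken = [name for name, steps in self.paths.items() if node_id in steps]
        for name in broken:
            del self.paths[name]
        return broken


g = ToyPathGraph()
g.nodes = {1: "GAT", 2: "TACA", 3: "C"}
g.edges = {(1, 2), (2, 3), (1, 3)}
g.paths = {"ref": [1, 2, 3], "alt": [1, 3]}

print(g.destroy_node(2))  # names of the paths that visited node 2
print(g.paths)            # paths that never touched node 2 survive
```

With this behavior, an algorithm that only knows about nodes and edges can delete nodes without leaving paths that refer to missing nodes, and paths that do not visit the deleted node are kept.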
Destroying the path seems sensible to me. I know @ekg has some other ideas about leaving "hidden handles" around to preserve paths when nodes have been removed. Also, I think it might be possible to rewrite the code around |
1. What were you trying to do?
I updated from vg 1.19.0 to 1.22.0 and wanted to continue working with my genome graph, after I ran into problems during or after mapping reads to that graph, as posted here and here.
2. What did you want to happen?
When I tried mapping to the graph and indices I already had, I got a warning that I was using an out-of-date XG format, so I decided to create new indices, like this:
3. What actually happened?
The first line already didn't work. The `vg mod` commands are fine, but `vg sort` crashes right away with the following message:
4. If you got a line like `Stack trace path: /somewhere/on/your/computer/stacktrace.txt`, please copy-paste the contents of that file here:
5. What data and command can the vg dev team use to make the problem happen?
I can put the graph somewhere to download if necessary.
I also tried creating the graph again to see if that made a difference, but I made it with minimap2+seqwish and only used `vg view` to convert it to vg format, and doing that again did not resolve the issue.
6. What does running `vg version` say?