vg deconstruct crashes on hprc-v1.0-minigraph-grch38.vg #3960

Closed
bw2 opened this issue May 16, 2023 · 6 comments

@bw2

bw2 commented May 16, 2023

I'm trying to run vg deconstruct to generate a VCF of the differences between hg38 chr1 and CHM13.

I ran

$ wget https://s3-us-west-2.amazonaws.com/human-pangenomics/pangenomes/freeze/freeze1/minigraph/hprc-v1.0-minigraph-grch38.gfa.gz
$ vg convert <(gunzip -c hprc-v1.0-minigraph-grch38.gfa.gz) > hprc-v1.0-minigraph-grch38.vg
$ vg deconstruct --path chr1 --path-prefix 'CHM13#'  hprc-v1.0-minigraph-grch38.vg --verbose
Computed overlay in 2.30701 seconds using 2.90468 CPU seconds.
Finding snarls
Deconstructing top-level snarls
terminate called after throwing an instance of 'std::runtime_error'
  what():  Reference path must have a sample name
━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.48.0 "Gallipoli"
Stack trace (most recent call last):
#12   Object "/home/weisburd/bin/vg", at 0x5f0e3d, in _start
#11   Object "/home/weisburd/bin/vg", at 0x1ed71af, in __libc_start_main
#10   Object "/home/weisburd/bin/vg", at 0x5c0ade, in main
#9    Object "/home/weisburd/bin/vg", at 0xd3143b, in vg::subcommand::Subcommand::operator()(int, char**) const
#8    Object "/home/weisburd/bin/vg", at 0xca608e, in main_deconstruct(int, char**)
#7    Object "/home/weisburd/bin/vg", at 0xe3b4f2, in vg::Deconstructor::deconstruct(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, handlegraph::PathPositionHandleGraph const*, vg::SnarlManager*, bool, int, bool, int, bool, bool, bool, gbwt::GBWT*)
#6    Object "/home/weisburd/bin/vg", at 0x595dbf, in handlegraph::PathMetadata::create_path_name(handlegraph::PathSense const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned long const&, unsigned long const&, std::pair<unsigned long, unsigned long> const&) [clone .cold]
#5    Object "/home/weisburd/bin/vg", at 0x1e13228, in __cxa_throw
#4    Object "/home/weisburd/bin/vg", at 0x1e130c6, in std::terminate()
#3    Object "/home/weisburd/bin/vg", at 0x1e1305b, in __cxxabiv1::__terminate(void (*)())
#2    Object "/home/weisburd/bin/vg", at 0x5bd66a, in __gnu_cxx::__verbose_terminate_handler() [clone .cold]
#1    Object "/home/weisburd/bin/vg", at 0x5c0007, in abort
#0    Object "/home/weisburd/bin/vg", at 0x149611b, in raise
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━

Using -e and/or -a leads to the same error.

@glennhickey
Contributor

You may have to use an older version of vg to deconstruct that graph.

The issue is that in the year(s) since that graph was made, vg has adopted some naming conventions for path names, and it looks like compatibility broke in deconstruct here. This may be fixable, but using the older release should be a work-around in the meantime.
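
To illustrate what a path naming convention of this kind looks like, here is a minimal sketch. It assumes the convention in question is PanSN-style `sample#haplotype#contig` naming; the thread does not say so explicitly, and `split_path_name` is a hypothetical helper, not vg's own parser. It shows why a bare name like `chr1` carries no sample name, which matches the wording of the error above.

```python
from typing import NamedTuple, Optional


class PathName(NamedTuple):
    sample: Optional[str]
    haplotype: Optional[str]
    contig: str


def split_path_name(name: str, delim: str = "#") -> PathName:
    """Split a path name into PanSN-style fields (illustrative only).

    Assumption: names look like 'sample#haplotype#contig'. This is a
    hypothetical helper and does not reproduce vg's actual parsing rules.
    """
    parts = name.split(delim)
    if len(parts) == 1:
        # No delimiter at all: only a contig name, no sample.
        return PathName(None, None, parts[0])
    if len(parts) == 2:
        # Two fields: treat them as sample and contig.
        return PathName(parts[0], None, parts[1])
    # Three or more fields: anything after the second delimiter is the contig.
    return PathName(parts[0], parts[1], delim.join(parts[2:]))


for name in ("chr1", "GRCh38#0#chr1"):
    print(name, "->", split_path_name(name))
```

Under this assumption, `chr1` yields no sample field, while `GRCh38#0#chr1` yields sample `GRCh38`, haplotype `0`, and contig `chr1`.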

@bw2
Author

bw2 commented May 16, 2023

Thanks for the quick response. I just tried the older version but it crashes with this error:

$ vg
vg: variation graph tool, version v1.40.0 "Suardi"

usage: vg <command> [options]

main mapping and calling pipeline:
  -- autoindex     mapping tool-oriented index construction from interchange formats
  -- construct     graph construction
  -- rna           construct splicing graphs and pantranscriptomes
  -- index         index graphs or alignments for random access or mapping
  -- map           MEM-based read alignment
  -- giraffe       fast haplotype-aware short read alignment
  -- mpmap         splice-aware multipath alignment of short reads
  -- augment       augment a graph from an alignment
  -- pack          convert alignments to a compact coverage index
  -- call          call or genotype VCF variants
  -- help          show all subcommands

For more commands, type `vg help`.
For technical support, please visit: https://www.biostars.org/t/vg/

$ vg deconstruct --path chr1 --path-prefix 'CHM13#'  hprc-v1.0-minigraph-grch38.vg --verbose
Finding snarls
Deconstructing top-level snarls
vg: src/deconstructor.cpp:924: void vg::Deconstructor::deconstruct(std::vector<std::__cxx11::basic_string<char> >, const PathPositionHandleGraph*, vg::SnarlManager*, bool, int, bool, int, bool, bool, bool, const std::unordered_map<std::__cxx11::basic_string<char>, std::pair<std::__cxx11::basic_string<char>, int> >*, const std::unordered_map<std::__cxx11::basic_string<char>, int>*, gbwt::GBWT*): Assertion `path_to_sample_phase == nullptr || path_restricted || gbwt' failed.
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
Stack trace path: /tmp/vg_crash_CFqIie/stacktrace.txt
Please include the stack trace file in your bug report!

Does it matter that I ran the gfa => vg format conversion using vg v1.48?

@glennhickey
Contributor

In practice, you need to run vg deconstruct with either -e or -g (or run it on a .gbz).

I'm just noticing now that you're trying to deconstruct the minigraph graph (I'd thought it was the minigraph-cactus graph). This will not work, even if you fix the invocation and/or vg version, as deconstruct needs paths to write alleles, and minigraph graphs don't contain them. To look at variants in minigraph, you need to use gfatools.
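
One quick way to check whether a GFA file carries any paths before trying to deconstruct it is to count its record types. The sketch below assumes standard GFA conventions (`S` segments, `L` links, `P` paths, `W` walks); `count_gfa_records`, `has_paths`, and the toy rGFA lines are hypothetical illustrations, not part of vg or gfatools.

```python
from collections import Counter
from typing import Iterable


def count_gfa_records(lines: Iterable[str]) -> Counter:
    """Count GFA records by their first tab-separated field (S, L, P, W, ...)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        counts[line.split("\t", 1)[0]] += 1
    return counts


def has_paths(counts: Counter) -> bool:
    """True if the file has any path (P) or walk (W) records."""
    return counts["P"] > 0 or counts["W"] > 0


# Toy rGFA-like input (made up for illustration): segments and a link only.
toy_rgfa = [
    "S\ts1\tACGT\tSN:Z:chr1\tSO:i:0\tSR:i:0",
    "S\ts2\tGG\tSN:Z:chr1\tSO:i:4\tSR:i:0",
    "L\ts1\t+\ts2\t+\t0M",
]
counts = count_gfa_records(toy_rgfa)
print(dict(counts), "has paths:", has_paths(counts))
```

For a real compressed file, the same function can be fed from `gzip.open(path, "rt")`, which yields lines lazily and avoids loading the whole graph into memory.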

@bw2
Author

bw2 commented May 16, 2023

Thanks. I'm just starting to learn about genome graph tools. I didn't see a way to get variants using gfatools:

Usage: gfatools <command> <arguments>
Commands:
  view        read a GFA file
  stat        statistics about a GFA file
  gfa2fa      convert GFA to FASTA
  gfa2bed     convert rGFA to BED (requiring rGFA)
  blacklist   blacklist regions
  bubble      print bubble-like regions (EXPERIMENTAL)
  asm         miniasm-like graph transformation
  sql         export rGFA to SQLite (requiring rGFA)
  ed          GWFA prefix alignment (for evaluation only)
  version     print version number

I'll ask on their issue tracker.

I also tried starting with the minigraph-cactus graph (hprc-v1.0-mc-grch38.gfa) but couldn't get the "vg convert" command to run to completion.

As I was looking at the files again, I realized that a minigraph-cactus VCF is already available at https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=pangenomes/freeze/freeze1/minigraph-cactus/, so I'll just use that for now.

@bw2 bw2 closed this as completed May 16, 2023
@glennhickey
Contributor

See gfatools bubble.

To represent the topology and paths of the graph at the whole-genome scale, you must use a compressed format like gbwt. So if you want to rerun vg deconstruct on the minigraph-cactus graph you need to download the xg and gbwt and run vg deconstruct graph.xg -g graph.gbwt.

@bw2
Author

bw2 commented May 17, 2023

Great. Thank you
