vg grep the non-reference sequence from the Minigraph-Cactus result? #4193

ld9866 · 2023-12-15T07:27:30Z

Dear developer:
We currently use Minigraph-Cactus to build the pan-genome and convert it into vg format files. How can we extract the non-reference sequences in the pan-genome as FASTA? I did this with vg before, but the code was lost through our own negligence. Can you help me?

jeizenga · 2023-12-15T19:25:47Z

You can extract paths as a FASTA using vg paths --extract-fasta. I think the interface requires a GBWT as input, so you may need to pull out the GBWT from your GBZ using vg gbwt.
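A minimal sketch of that suggestion, not a tested pipeline. The helper names `extract_paths_fasta` and `drop_reference_records` are hypothetical, and so are the file names passed to them. The sketch assumes that `vg gbwt -o OUT -Z IN.gbz` writes the GBWT stored in a GBZ and that `vg paths` takes the graph with `-x` and the GBWT with `-g`; check `vg gbwt --help` and `vg paths --help` for your version. The second helper assumes that "non-reference" means every path whose name does not start with the reference sample prefix, which may not be what you need if you want only the novel sequence.

```shell
# Hypothetical helper: write the paths stored in a GBZ to a FASTA file.
# Arguments: $1 = graph file for -x (assumed to be accepted by your vg version),
#            $2 = input GBZ, $3 = output FASTA.
extract_paths_fasta() {
    graph=$1
    gbz=$2
    fasta=$3
    gbwt="${gbz%.gbz}.gbwt"
    # Pull the GBWT out of the GBZ.
    vg gbwt -o "$gbwt" -Z "$gbz" || return 1
    # Print the selected paths in FASTA format.
    vg paths -x "$graph" -g "$gbwt" --extract-fasta > "$fasta"
}

# Hypothetical helper: drop FASTA records whose name starts with a prefix.
# Argument: $1 = reference name prefix. Reads FASTA on stdin, writes to stdout.
drop_reference_records() {
    awk -v prefix="$1" '
        # On each header line, decide whether the record is kept.
        /^>/ { keep = (index(substr($0, 2), prefix) != 1) }
        # Print header and sequence lines of kept records only.
        keep
    '
}
```

For example, `extract_paths_fasta test.full.xg test.full.gbz all_paths.fa` followed by `drop_reference_records 'GRCh38#' < all_paths.fa > non_reference.fa`, where `GRCh38#` stands in for whatever prefix your reference paths carry.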

ld9866 · 2023-12-26T00:11:37Z

Dear developer:
We ran into a problem with vg index: it reports insufficient memory, even though our machine has 1 TB of RAM. Are our commands and overall approach correct, and how should we solve this problem?
vg mod -X 256 test.full.vg > test.full.mod.vg
vg index -x test.full.xg -g test.full.gcsa -k 16 -t 8 test.full.mod.vg
error:
InputGraph::InputGraph(): Memory use of input kmers (1149.82 GB) exceeds memory limit (1024 GB)
vg gbwt -g test.full.gbwt -t 8 -x test.full.xg test.full.vg

jeizenga · 2024-01-03T20:59:40Z

Is there a reason you are doing a manual indexing pipeline instead of using vg autoindex? For most users, vg autoindex is more robust to issues like this.

It's also unclear to me which mapping tool you're planning to use. The GCSA2 index is used by vg map, but the GBWT usually is not. The GBWT is typically used in vg giraffe. This is another reason to use vg autoindex: it can determine exactly which indexes you need based on the mapping tool you want to use.
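As a rough sketch of what that could look like, assuming the pan-genome is also available as a GFA (Minigraph-Cactus can emit one): the helper name `autoindex_for_mapper` and the file names in the usage note are hypothetical. The `--workflow`, `--gfa`, and `--prefix` options are written from memory of `vg autoindex --help`, so confirm them against your installed version.

```shell
# Hypothetical helper: run vg autoindex for the chosen mapping tool.
# Arguments: $1 = mapper ("map" or "giraffe"), $2 = input GFA, $3 = output prefix.
autoindex_for_mapper() {
    mapper=$1
    gfa=$2
    prefix=$3
    case "$mapper" in
        # vg map uses the GCSA2 index; vg giraffe uses the GBWT-based indexes.
        map|giraffe) ;;
        *) echo "unsupported mapper: $mapper" >&2; return 1 ;;
    esac
    # autoindex decides which indexes to build from the workflow name.
    vg autoindex --workflow "$mapper" --gfa "$gfa" --prefix "$prefix"
}
```

For example, `autoindex_for_mapper giraffe test.full.gfa test.full` would request only the indexes that vg giraffe needs, which skips the GCSA2 construction step that ran out of memory above.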

jeizenga closed this as completed Dec 15, 2023
