vg index generates Message Node::encode(): Offset 69398 too large #337

Closed
jjfarrell opened this issue May 12, 2016 · 1 comment

@jjfarrell

Any suggestions for this error message, which appears while running vg index?

Node::encode(): Offset 69398 too large error

The vg file was constructed using this command line:
vg construct -r chr11.fasta >chr11.vg
WARNING: Lower case letters found during construction
Sequences may not map to this reference.

chr11.fasta contains only the beginning of the chromosome, extending just beyond the gamma globin genes (11:1-5280000).

vg index -x chr11.xg -g chr11.gcsa -k 11 -T -t 4 -p chr11.vg
loading graph [===================================================]100.0%
Node::encode(): Offset 69398 too large============= ] 33.3%
Node::encode(): Offset 69409 too large
Node::encode(): Offset 69399 too large
Node::encode(): Offset 69410 too large
Node::encode(): Offset 97528 too large
Node::encode(): Offset 97539 too large
Node::encode(): Offset 97529 too large
Node::encode(): Offset 97540 too large
Node::encode(): Offset 97530 too large
Node::encode(): Offset 97541 too large
Node::encode(): Offset 97531 too large
Node::encode(): Offset 97542 too large
Node::encode(): Offset 97532 too large
Node::encode(): Offset 97543 too large
Node::encode(): Offset 97533 too large
Node::encode(): Offset 97544 too large

@adamnovak
Member

The GCSA2 index packs kmer offsets into (I believe) 10 bits, so you can't GCSA2-index a graph with nodes larger than 1024 bases.
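To make the arithmetic concrete, here is a minimal Python sketch of why the offsets in the log get rejected. It assumes the 10-bit offset field mentioned above (which is itself stated with an "I believe"); the names `OFFSET_BITS`, `MAX_OFFSET`, and `fits_in_offset_field` are hypothetical and are not GCSA2 identifiers.

```python
# Assumption: within-node offsets are packed into 10 bits, as suggested above.
OFFSET_BITS = 10
MAX_OFFSET = (1 << OFFSET_BITS) - 1  # largest value a 10-bit field can hold


def fits_in_offset_field(offset: int) -> bool:
    """Return True if a within-node offset fits in the assumed 10-bit field."""
    return 0 <= offset <= MAX_OFFSET


# A few of the offsets reported in the error log of this issue.
reported_offsets = [69398, 69409, 97528, 97544]

for offset in reported_offsets:
    status = "fits" if fits_in_offset_field(offset) else "too large"
    print(offset, status)
```

Under that assumption, every offset in the log is far beyond what the field can hold, which suggests the graph contains nodes tens of thousands of bases long.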

You can use vg mod -X 1000 graph.vg > chopped.vg to chop up your graph nodes, or use the -m option in vg construct to make sure that they are small enough to begin with.
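As a rough illustration of what chopping achieves, the sketch below splits one long node sequence into pieces of at most 1000 bases. This is not how `vg mod -X` is implemented (it also has to rewire edges and paths in the graph); `chop_sequence` is a hypothetical helper that only shows the effect on a single linear sequence.

```python
from typing import List


def chop_sequence(sequence: str, max_len: int = 1000) -> List[str]:
    """Split a sequence into consecutive pieces of at most max_len bases.

    Hypothetical helper for illustration only; it ignores graph edges and paths.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


# A 2400-base toy sequence standing in for one oversized node.
long_node = "ACGT" * 600
pieces = chop_sequence(long_node, max_len=1000)

# Every piece is now short enough that its offsets stay below 1024.
print([len(p) for p in pieces])
```

Joining the pieces back together gives the original sequence, so chopping changes only the node boundaries, not the sequence the graph represents.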

This really needs to be the default in vg construct.

@adamnovak adamnovak self-assigned this May 13, 2016
adamnovak added a commit to adamnovak/vg that referenced this issue May 13, 2016
Now `vg construct` with default options should not produce nodes too big
for `vg index` to GCSA2-index.

Closes vgteam#250. Closes vgteam#337.