This is a suggestion, not a bug or support request.
I tried using GFF files I downloaded from a species-specific database to annotate a graph, but after running vg annotate and vg augment, vg paths -L still listed only the paths of the sequences I used to construct the graph. After some experimenting with the GFF file, I realised that the tags in the attributes column were mostly lowercase, whereas the official GFF specification starts its predefined tags with an uppercase letter. So I changed "name" to "Name" and suddenly got a well-annotated graph.
I know the underlying problem is incorrectly formatted GFF files, but maybe you could consider making vg annotate a little more flexible so that it can cope with formatting errors like this, or have it give some kind of feedback to let users know that there was nothing to annotate.
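For anyone hitting the same problem, the manual fix described above (renaming "name" to "Name") can be scripted. The following is only a minimal sketch, not part of vg: the function name `normalize_gff_line` and the `TAG_FIXES` mapping are hypothetical, and it assumes a tab-separated GFF file with `tag=value` pairs separated by `;` in the ninth column.

```python
# Hypothetical helper, not part of vg: rewrite lowercase attribute tags
# in column 9 of a GFF line to the capitalised form.
# Assumption: only "name" needs fixing; extend TAG_FIXES for other tags.
TAG_FIXES = {"name": "Name"}


def normalize_gff_line(line: str) -> str:
    """Return the line with known attribute tags capitalised."""
    # Leave comments, directives and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    # A feature line has exactly nine columns; skip anything else.
    if len(cols) != 9:
        return line
    fixed = []
    for field in cols[8].split(";"):
        tag, sep, value = field.partition("=")
        # Case-insensitive lookup; unknown tags are kept as they are.
        tag = TAG_FIXES.get(tag.strip().lower(), tag)
        fixed.append(tag + sep + value)
    cols[8] = ";".join(fixed)
    return "\t".join(cols) + newline
```

Applying the function to every line of the downloaded file and writing the result to a new file should give input in which the "Name" tag is present, without editing the file by hand.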
Hi, thank you for the suggestion. The problem is that the name of each alignment is set to an empty string when the attribute tag "Name" cannot be found. The graph is still augmented correctly using these alignments, but all annotations are added as a single path with no name. For some reason, the empty-named path seems to be missing and added to another path in the xg index. I have opened a separate issue about this in the xg repo (vgteam/xg#29). I will add a warning to vg annotate for when the attribute tag "Name" cannot be found.
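Until such a warning exists, a file can be checked before it is passed to vg annotate. The sketch below is an illustration only and does not reflect how vg parses GFF internally: `count_missing_tag` is a hypothetical helper, and it assumes tab-separated feature lines with `;`-separated `tag=value` attributes in the ninth column.

```python
# Hypothetical pre-flight check, not part of vg: count feature lines
# whose attributes column lacks a given tag (case-sensitive).
def count_missing_tag(lines, tag="Name"):
    """Return (number of feature lines, number lacking the tag)."""
    total = 0
    missing = 0
    for line in lines:
        # Skip comments, directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Only nine-column lines are treated as features.
        if len(cols) != 9:
            continue
        total += 1
        tags = {field.partition("=")[0].strip() for field in cols[8].split(";")}
        if tag not in tags:
            missing += 1
    return total, missing
```

If the second number is non-zero, those features would be expected to end up without a name, which matches the behaviour described above.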
As an alternative, you can also use vg rna for the annotate + augment pipeline. This subcommand is a bit more flexible when it comes to parsing GTF/GFF files: you can set the attribute tag you want used as the ID and the feature type you want parsed. It has been written with transcript annotations in mind, but can be used for any type of annotation. Also, if you have a set of haplotypes, you can project your annotation onto each of them, creating a haplotype-specific annotation set.