The ASQG format describes an assembly graph. Each line is a tab-delimited record. The first field in each record describes the record type. The three types are:
- HT: Header record. This record contains metadata tags for the file version (VN tag) and parameters associated with the graph (for example, the minimum overlap length).
- VT: Vertex record. The second field contains the vertex identifier and the third field contains the sequence. Subsequent fields contain optional tags.
- ED: Edge description record. The second field describes a pair of overlapping sequences. A full description of this field is below. Subsequent fields contain optional tags.
Tags follow the same format as optional fields in SAM, TAG:TYPE:VALUE (for example, OL:i:45).
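As a rough illustration of the tag format, the sketch below splits one tag into its name and a typed value. The function name `parse_tag` is hypothetical and not part of any ASQG tool, and only the `i`, `f` and `Z` type codes that appear in this page's example are handled specially.

```python
def parse_tag(field):
    """Split one SAM-style tag of the form TAG:TYPE:VALUE (illustrative helper)."""
    # Split at most twice so that a string value may itself contain ':'.
    tag, type_code, raw = field.split(":", 2)
    if type_code == "i":
        value = int(raw)
    elif type_code == "f":
        value = float(raw)
    else:
        # Any other type code (for example Z, a string) is kept as text.
        value = raw
    return tag, value


# Tags taken from the HT record of the example on this page.
header_fields = ["VN:i:1", "ER:f:0", "OL:i:45", "IN:Z:reads.fa", "CN:i:1", "TE:i:0"]
tags = dict(parse_tag(f) for f in header_fields)
```

Here `tags["OL"]` holds the minimum overlap length as an integer and `tags["IN"]` holds the input file name as a string.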
HT	VN:i:1	ER:f:0	OL:i:45	IN:Z:reads.fa	CN:i:1	TE:i:0
VT	read1	GATCGATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGG
VT	read2	CGATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGGATA
VT	read3	ATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGGATATT
ED	read2 read1 0 46 50 3 49 50 0 0
ED	read3 read2 0 47 50 2 49 50 0 0
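A minimal sketch of how such a file might be read, dispatching on the first field of each record. The function `parse_asqg` and its return shape are assumptions made for illustration, not an existing API. The sketch assumes one record per line with tab-separated fields, and it ignores the optional tags on VT and ED records.

```python
def parse_asqg(lines):
    """Group ASQG records by type (illustrative, not a reference parser)."""
    header = []    # raw tag strings from HT records
    vertices = {}  # vertex identifier -> sequence
    edges = []     # raw overlap field of each ED record
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        record_type = fields[0]
        if record_type == "HT":
            header.extend(fields[1:])
        elif record_type == "VT":
            vertices[fields[1]] = fields[2]
        elif record_type == "ED":
            edges.append(fields[1])
        else:
            raise ValueError("unknown record type: " + record_type)
    return header, vertices, edges


# The example from this page, with tabs written explicitly between fields.
EXAMPLE = "\n".join([
    "HT\tVN:i:1\tER:f:0\tOL:i:45\tIN:Z:reads.fa\tCN:i:1\tTE:i:0",
    "VT\tread1\tGATCGATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGG",
    "VT\tread2\tCGATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGGATA",
    "VT\tread3\tATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGGATATT",
    "ED\tread2 read1 0 46 50 3 49 50 0 0",
    "ED\tread3 read2 0 47 50 2 49 50 0 0",
])
header, vertices, edges = parse_asqg(EXAMPLE.splitlines())
```

Keeping the ED overlap field as a raw string here leaves its interpretation to a separate step.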
The second field of an ED record describes an overlap between a pair of sequences. This field contains 10 space-separated elements:
- sequence 1 name
- sequence 2 name
- sequence 1 overlap start (0-based)
- sequence 1 overlap end (inclusive)
- sequence 1 length
- sequence 2 overlap start (0-based)
- sequence 2 overlap end (inclusive)
- sequence 2 length
- sequence 2 orientation (0 for the same orientation as sequence 1, 1 for reversed with respect to sequence 1)
- number of differences in overlap (0 for perfect overlaps, which is the default)
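The ten elements listed above can be unpacked as in the sketch below. The `Overlap` class, its attribute names and `parse_overlap` are hypothetical names chosen for illustration. The final check uses the first ED record and the read1 and read2 sequences from the example on this page, and it only applies to overlaps in the same orientation; a reversed overlap would need a reverse complement before comparing.

```python
from dataclasses import dataclass


@dataclass
class Overlap:
    name1: str
    name2: str
    start1: int      # 0-based
    end1: int        # inclusive
    length1: int
    start2: int      # 0-based
    end2: int        # inclusive
    length2: int
    reversed2: bool  # True when the orientation element is 1
    num_diff: int


def parse_overlap(field):
    """Parse the second field of an ED record (illustrative helper)."""
    parts = field.split()
    if len(parts) != 10:
        raise ValueError("expected 10 elements, got %d" % len(parts))
    name1, name2 = parts[0], parts[1]
    s1, e1, l1, s2, e2, l2, orient, nd = (int(p) for p in parts[2:])
    return Overlap(name1, name2, s1, e1, l1, s2, e2, l2, orient == 1, nd)


ov = parse_overlap("read2 read1 0 46 50 3 49 50 0 0")

read1 = "GATCGATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGG"
read2 = "CGATCTAGCTAGCTAGCTAGCTAGTTAGATGCATGCATGCTAGCTGGATA"

# Sequence 1 of this record is read2 and sequence 2 is read1.
# End coordinates are inclusive, so add 1 when slicing.
seg1 = read2[ov.start1:ov.end1 + 1]
seg2 = read1[ov.start2:ov.end2 + 1]
```

Because the end coordinates are inclusive, the overlap length is `end - start + 1`, which gives 47 bases for this record.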