akikuno/csvtag

Caution

This package is currently under development. Please use midsv until it is complete.

csvtag

csvtag is a toolkit for the csv tag, an extension of the cs tag format that supports inversions.

Specification

This is essentially the same encoding as the minimap2 cs tag, but with the one difference that lowercase letters represent inversions:

| Prefix | Sequence | Description |
| --- | --- | --- |
| `=` | `[ACGTN]+` | Identical sequence (long form) |
| `:` | `[0-9]+` | Identical sequence length |
| `*` | `[ACGTN][ACGTN]` | Substitution: ref to query |
| `+` | `[ACGTN]+` | Insertion to the reference |
| `-` | `[ACGTN]+` | Deletion from the reference |
| `~` | `[ACGTN]{2}[0-9]+[ACGTN]{2}` | Intron length and splice signal |
| `[=+-*~]` | `[acgtn]` | Inversion |
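
As an illustration of the table above, here is a minimal sketch of splitting a csv tag into its operations with a regular expression. The names `CSV_TAG_PATTERN` and `split_csv_tag` are hypothetical and are not part of the csvtag package; the sketch also assumes that each operation is written entirely in uppercase or entirely in lowercase.

```python
import re

# One alternative per row of the specification table (uppercase form).
_UPPER = (
    r"=[ACGTN]+"                      # identical sequence (long form)
    r"|:[0-9]+"                       # identical sequence length
    r"|\*[ACGTN]{2}"                  # substitution: ref to query
    r"|\+[ACGTN]+"                    # insertion
    r"|-[ACGTN]+"                     # deletion
    r"|~[ACGTN]{2}[0-9]+[ACGTN]{2}"   # intron length and splice signal
)
# Assumption: an inverted operation uses the same syntax with lowercase bases.
_LOWER = _UPPER.replace("ACGTN", "acgtn")
CSV_TAG_PATTERN = re.compile(f"{_UPPER}|{_LOWER}")


def split_csv_tag(tag):
    """Split a csv tag into (prefix, body, is_inverted) tuples."""
    tokens = CSV_TAG_PATTERN.findall(tag)
    # findall silently skips unmatched characters, so verify full coverage.
    if "".join(tokens) != tag:
        raise ValueError(f"not a valid csv tag: {tag!r}")
    result = []
    for token in tokens:
        prefix, body = token[0], token[1:]
        # Lowercase bases mark an inversion; ':' bodies contain digits only.
        is_inverted = any(ch.islower() for ch in body)
        result.append((prefix, body, is_inverted))
    return result
```

Note that a `:` operation carries no bases, so under this sketch it can never be flagged as inverted.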

Important

All csv tags are expressed relative to the forward strand of the reference sequence (SAM FLAG is 0). Reverse-strand alignments are reverse complemented in their entirety.

Definition of Inversion

  • Inversion detection uses RNAME, POS, and FLAG from SAM files.
    • Sort alignments by RNAME and POS.
    • If a QNAME has 2 or fewer alignments, there is no inversion, so output the cs tag in uppercase.
    • If a QNAME has 3 or more alignments, detect inversions.
      • Extract three alignments in order of ascending POS (first, second, third).
      • If the first, second, and third alignments are within 50 bp of each other, and only the second is in reverse orientation, then the second is called an inversion.
      • Reverse complement the cs tag of the second alignment and output it as a csv tag in lowercase.
      • If there are gaps between the first, second, and third alignments, fill them with N.
    • Apply the same process to each subsequent set of adjacent alignments.
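
The detection steps above can be sketched roughly as follows. This is not the package's implementation: the `Alignment` record, its precomputed `end` field, and the function names are hypothetical, and "within 50 bp" is interpreted here as at most 50 uncovered reference bases between neighbouring alignments. The only external fact relied on is that bit `0x10` of the SAM FLAG marks a reverse-strand alignment.

```python
from dataclasses import dataclass

REVERSE_FLAG = 0x10  # SAM FLAG bit: read aligned to the reverse strand


@dataclass
class Alignment:
    """Hypothetical minimal record for one SAM alignment of a single QNAME."""
    rname: str
    pos: int   # 1-based leftmost reference position (SAM POS)
    end: int   # 1-based rightmost reference position (assumed precomputed)
    flag: int


def is_reverse(aln):
    return bool(aln.flag & REVERSE_FLAG)


def find_inversions(alignments, max_gap=50):
    """Return the alignments of one QNAME that are called inversions."""
    ordered = sorted(alignments, key=lambda a: (a.rname, a.pos))
    inversions = []
    # With 2 or fewer alignments the loop body never runs: no inversion.
    for i in range(len(ordered) - 2):
        first, second, third = ordered[i:i + 3]
        same_ref = first.rname == second.rname == third.rname
        # Assumption: the gap is the number of uncovered bases between neighbours.
        close = (second.pos - first.end - 1 <= max_gap
                 and third.pos - second.end - 1 <= max_gap)
        only_second_reverse = (not is_reverse(first)
                               and is_reverse(second)
                               and not is_reverse(third))
        if same_ref and close and only_second_reverse:
            inversions.append(second)
    return inversions
```

Reverse complementing the cs tag of the inverted alignment and filling gaps with N are left out of this sketch.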

Functions

  • csvtag.call(): Generate a csv tag
  • csvtag.to_sequence(): Reconstruct a query subsequence from the alignment
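
To illustrate what reconstructing a query subsequence involves, here is a minimal standalone sketch based on the prefix meanings in the specification. It does not call or reproduce `csvtag.to_sequence()`, whose signature is not documented here; `reconstruct_query` is a hypothetical name. The sketch assumes a well-formed long-form tag (`=` rather than `:`) and, as a further assumption, leaves inverted (lowercase) bases in lowercase.

```python
import re

# A token is one prefix character followed by everything up to the next prefix.
_TOKEN = re.compile(r"[=:*+\-~][^=:*+\-~]*")


def reconstruct_query(tag):
    """Rebuild the query subsequence from a long-form csv tag (sketch)."""
    parts = []
    for token in _TOKEN.findall(tag):
        prefix, body = token[0], token[1:]
        if prefix == "=":
            parts.append(body)       # identical bases appear in the query
        elif prefix == "*":
            parts.append(body[1])    # substitution is "ref then query"
        elif prefix == "+":
            parts.append(body)       # inserted bases appear in the query
        elif prefix in "-~":
            continue                 # deletions and introns are absent from the query
        else:
            # ':' stores only a length, so the bases cannot be recovered.
            raise ValueError("short-form ':' needs the reference sequence")
    return "".join(parts)
```

For example, `reconstruct_query("=ACGT*AG+TT-CC=GG")` keeps `ACGT`, takes `G` from the substitution, adds the inserted `TT`, skips the deleted `CC`, and appends `GG`.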

Usage
