
Chain artefact resolution

Yury V. Malovichko edited this page May 7, 2026 · 1 revision

Certain artefacts of whole-genome alignment and alignment chaining may interfere with projecting the aligned reference isoforms to the query genome. The most widespread type of aberrant projection resulting from these artefacts is the readthrough (“chimeric”) projection, in which different exons of the reference isoform are aligned to the respective exons of independent gene copies in the query. Chimeric projections are most prevalent at tandem gene duplication loci in the query and can significantly confound gene cluster annotation, although they sometimes also lump together query genes located far apart from each other. Chimeric projections often display distinctive hallmarks that TOGA2 uses to resolve them and salvage as many true query gene copies as possible:

  1. In the simplest case, a chimeric projection consists of precise alignments of reference exons to their query homologs, with the alignments to different gene copies connected by unusually long introns. If the search spaces of two exons in the query are separated by a sequence that is 5 times longer than the counterpart reference exon and longer than 500 kilobases, the projection is split into two portions corresponding to the sequence upstream and downstream of the intron. The portion with the greater aligned sequence fraction in the chain is considered to represent the orthologous alignment. Exons in the other portion are considered missing from the chain, and their search space is annotated by extending the projection’s search space.
  2. Less often, chimeric projections chain together portions of the same reference exon aligned to different homologous exons in the query. In the alignment chains, this manifests as large intra-exonic insertions. If an exon’s projection to the query via the chain is at least five times longer than the exon’s length in the reference, the projection is split into two portions corresponding to the exons upstream and downstream of the focal exon. The partitioning logic is similar to that for intronic insertions. The focal exon, whose alignment caused the projection to be flagged as chimeric, is annotated based on the part of it that aligns on the same side of the insertion as the portion with the greater aligned fraction.
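
The two criteria above can be illustrated with a minimal Python sketch. This is not TOGA2's actual code: the `Exon` class, all function names, and the `ref_counterpart_lens` parameter are hypothetical and exist only for illustration. The sketch assumes that exons are sorted by query coordinate on the plus strand, that the length of the reference counterpart for each gap is supplied by the caller, that the aligned fraction of a portion is its aligned bases divided by its total reference exon length, and that ties go to the upstream portion. Only the 5-fold and 500-kilobase thresholds come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

LENGTH_RATIO = 5        # "5 times longer" threshold from the text
MIN_GAP_BP = 500_000    # "longer than 500 kilobases" threshold from the text


@dataclass
class Exon:
    """Hypothetical record for one projected exon (not a TOGA2 class)."""
    ref_len: int       # exon length in the reference
    query_start: int   # start of the exon's search space in the query
    query_end: int     # end of the exon's search space in the query
    aligned_bp: int    # bases of this exon aligned in the chain


def find_intronic_split(exons: List[Exon],
                        ref_counterpart_lens: List[int]) -> Optional[int]:
    """Case 1: return i if the gap between exons[i] and exons[i + 1]
    exceeds both thresholds, else None.

    ref_counterpart_lens[i] is the length of the reference counterpart
    for that gap (an assumed, caller-supplied input).
    """
    for i in range(len(exons) - 1):
        gap = exons[i + 1].query_start - exons[i].query_end
        if gap > LENGTH_RATIO * ref_counterpart_lens[i] and gap > MIN_GAP_BP:
            return i
    return None


def find_exonic_split(exons: List[Exon]) -> Optional[int]:
    """Case 2: return the index of the first exon whose query projection
    is at least LENGTH_RATIO times its reference length, else None."""
    for i, exon in enumerate(exons):
        if exon.query_end - exon.query_start >= LENGTH_RATIO * exon.ref_len:
            return i
    return None


def aligned_fraction(portion: List[Exon]) -> float:
    """Aligned bases over total reference exon length (assumed definition)."""
    total_ref = sum(e.ref_len for e in portion)
    if total_ref == 0:
        return 0.0
    return sum(e.aligned_bp for e in portion) / total_ref


def pick_orthologous_portion(upstream: List[Exon],
                             downstream: List[Exon]) -> str:
    """Keep the portion with the greater aligned fraction.
    Ties go to the upstream portion (an arbitrary choice in this sketch)."""
    if aligned_fraction(upstream) >= aligned_fraction(downstream):
        return "upstream"
    return "downstream"


# Usage: the third exon sits about 700 kb away from the second one.
exons = [
    Exon(ref_len=100, query_start=0, query_end=100, aligned_bp=95),
    Exon(ref_len=120, query_start=1_000, query_end=1_120, aligned_bp=110),
    Exon(ref_len=90, query_start=700_000, query_end=700_090, aligned_bp=40),
]
split_at = find_intronic_split(exons, ref_counterpart_lens=[900, 2_000])
if split_at is not None:
    kept = pick_orthologous_portion(exons[:split_at + 1], exons[split_at + 1:])
```

In the intra-exonic case, the same `pick_orthologous_portion` call would be applied to the exons on either side of the focal exon, which is then annotated from its part lying on the kept side of the insertion.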
