Missing transposition alignments. #830
Comments
These are aligned as long indels.
Hi Heng. I understand that these are aligned as long indels, but that loses the genomic information that the "long deletion" and the "long insertion" are actually the same sequence that has relocated. To capture this information, it would be helpful to have separate alignments for transpositions. Is it possible to make minimap2 output such regions as separate alignments?
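To make the point above concrete, here is a minimal sketch of a cut-and-paste transposition on a toy sequence. The function name `transpose`, the random reference, and the coordinates are all hypothetical and chosen only for illustration; they are not the test files attached to this issue.

```python
import random


def transpose(seq, start, length, new_pos):
    """Cut seq[start:start+length] and paste it at new_pos.

    new_pos is a coordinate in the sequence that remains after the cut.
    Returns the rearranged sequence and the block that was moved.
    """
    block = seq[start:start + length]
    remainder = seq[:start] + seq[start + length:]
    return remainder[:new_pos] + block + remainder[new_pos:], block


random.seed(0)  # fixed seed so the toy reference is reproducible
ref = "".join(random.choice("ACGT") for _ in range(200))

# Move the 30 bp block at ref[20:50] downstream (hypothetical coordinates).
query, moved = transpose(ref, 20, 30, 120)

# Relative to ref, query has a 30 bp "deletion" at position 20 and a
# 30 bp "insertion" at query position 120, and both are the same sequence.
deleted = ref[20:50]
inserted = query[120:150]
print(deleted == inserted)
```

An alignment that reports only one long deletion and one long insertion does not record that `deleted` and `inserted` are identical, which is the information a separate alignment of the moved block would preserve.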
You can use
I am not sure how your simulated example could arise biologically. Long duplications are usually tandem, with messy boundaries and gene conversions. In my view, long INDELs are more often the preferred output. This is how the default is tuned.
One example of translocation was reported earlier (Figure 2B).
I was saying that your short insertion-deletion at close range is unlikely. A transposon-mediated event involves an insertion, not moving sequences. Human has an L1-mediated XTR between the sex chromosomes, dated to ~1 Mya. That happened at a much larger scale and is accompanied by many smaller SVs. It will not be contained in a long alignment. Translocations also occur at the arm level, unlike your example.
DNA transposons (class II transposons) use a cut-and-paste mechanism that moves DNA. These transposons are reported to be only a few kbp long. I am not sure how the insertion target for these transposons is selected, but I assume that they can result in translocations within local regions. Also, I think non-human species have more genomic rearrangements. For example, in our data, we find a region with two large inversions and a translocation adjacent to each other between two strains (Col-0 and Ler) of A. thaliana. So, to detect such genomic rearrangements, separate alignments would work better than alignments with large indels.
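As a rough illustration of why separate alignments help downstream detection, the sketch below takes PAF records and reports forward-strand alignments that do not fit the heaviest collinear chain. It assumes only the 12 standard PAF columns; the helper names and the four example records are hypothetical (they are not actual minimap2 output), and inversions on the minus strand are ignored for brevity.

```python
from collections import namedtuple

# The 12 mandatory PAF columns.
Paf = namedtuple(
    "Paf",
    "qname qlen qstart qend strand tname tlen tstart tend nmatch alnlen mapq",
)


def parse_paf_line(line):
    """Parse the mandatory PAF columns; optional tags are ignored."""
    f = line.rstrip("\n").split("\t")
    return Paf(f[0], int(f[1]), int(f[2]), int(f[3]), f[4], f[5],
               int(f[6]), int(f[7]), int(f[8]), int(f[9]), int(f[10]),
               int(f[11]))


def non_collinear(records):
    """Return forward-strand records outside the heaviest collinear chain.

    Simple O(n^2) chaining: record j may precede record i only if it ends
    before i starts on both query and target. Real alignments can overlap
    slightly, so a tolerance would be needed in practice (assumption).
    """
    fwd = sorted((r for r in records if r.strand == "+"),
                 key=lambda r: r.qstart)
    n = len(fwd)
    if n == 0:
        return []
    best = [r.alnlen for r in fwd]  # best chain weight ending at i
    prev = [-1] * n                 # back-pointer for chain recovery
    for i in range(n):
        for j in range(i):
            ok = fwd[j].qend <= fwd[i].qstart and fwd[j].tend <= fwd[i].tstart
            if ok and best[j] + fwd[i].alnlen > best[i]:
                best[i] = best[j] + fwd[i].alnlen
                prev[i] = j
    i = max(range(n), key=best.__getitem__)
    chain = set()
    while i != -1:
        chain.add(i)
        i = prev[i]
    return [r for k, r in enumerate(fwd) if k not in chain]


# Hypothetical records: a 5 kb block moved from ref 20-25 kb to query 70-75 kb.
lines = [
    "qry\t100000\t0\t20000\t+\tref\t100000\t0\t20000\t20000\t20000\t60",
    "qry\t100000\t20000\t70000\t+\tref\t100000\t25000\t75000\t50000\t50000\t60",
    "qry\t100000\t70000\t75000\t+\tref\t100000\t20000\t25000\t5000\t5000\t60",
    "qry\t100000\t75000\t100000\t+\tref\t100000\t75000\t100000\t25000\t25000\t60",
]
candidates = non_collinear([parse_paf_line(l) for l in lines])
for r in candidates:
    print(r.qname, r.qstart, r.qend, r.tname, r.tstart, r.tend)
```

This only works if the moved block is reported as its own record. When the aligner folds it into a single alignment with a long deletion and a long insertion, there is no off-chain record to find.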
Similar to #816 there are issues in aligning long transpositions between genomes in the current version of minimap2.
For a 50Kb transposition, output from version 2.17-r974-dirty:
whereas, output from version 2.22-r1110-dirty:
Smaller transpositions are not identified by either version. Testing a 5Kb transposition, output from version 2.17-r974-dirty:
whereas, output from version 2.22-r1110-dirty:
It would be great if such regions could also be identified.
Test files:
reference.txt
seq_up_trans5000.fa.txt
seq_up_trans50000.fa.txt