Semi-curated de-novo transposon libraries for 193 genome assemblies representing 119 Drosophilid species.
A list of all species is provided in species_list.txt. The FASTA
folder contains transposon consensus sequences for each species, and the info
folder contains the following information about each transposon:
- Name
- Class/Family
- Predicted subfamily (95/80/98)
- Predicted subfamily (90/80/90)
- Predicted subfamily (80/80/80)
- env ORF present
- gag ORF present
- pol ORF present
- Count (buildSummary.pl)
- Hits (blastn hits covering at least 80 nt and 50% of query length)
- Coverage (buildSummary.pl)
- Coverage (calcDivergenceFromAlign.pl)
- Kimura divergence (calcDivergenceFromAlign.pl)
- Length
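
As an illustration of the Hits criterion above, here is a minimal Python sketch of a filter that keeps a blastn hit only if it covers at least 80 nt and at least 50% of the query length. The `Hit` record, its field names, and the helper functions are hypothetical and not part of this repository. The sketch also assumes that "covering" refers to the aligned span on the query; the actual pipeline may compute this differently.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record for one blastn hit (field names are assumptions)."""
    query: str     # transposon consensus name
    subject: str   # sequence in the genome assembly
    q_start: int   # alignment start on the query, 1-based inclusive
    q_end: int     # alignment end on the query, 1-based inclusive
    q_len: int     # full length of the query consensus


def passes_hit_filter(hit: Hit, min_nt: int = 80, min_fraction: float = 0.5) -> bool:
    """Return True if the hit covers >= min_nt and >= min_fraction of the query."""
    covered = abs(hit.q_end - hit.q_start) + 1
    return covered >= min_nt and covered >= min_fraction * hit.q_len


def count_hits(hits: Iterable[Hit]) -> int:
    """Count the hits that pass both thresholds."""
    return sum(1 for hit in hits if passes_hit_filter(hit))


# Made-up example data for a 200 nt consensus sequence.
example_hits = [
    Hit("TE1", "contig_A", 1, 120, 200),   # 120 nt, 60% of query: kept
    Hit("TE1", "contig_B", 10, 80, 200),   # 71 nt: too short
    Hit("TE1", "contig_C", 1, 90, 200),    # 90 nt, but only 45% of query
]
print(count_hits(example_hits))
```

Both thresholds must hold at once, so short consensus sequences are limited by the 80 nt floor and long ones by the 50% fraction.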
A description of how the transposon libraries were generated is available at https://github.com/susbo/Uni-strand_clusters.
The current version is 0.99. For all versions, see the releases on this repository.
This project is licensed under the GNU GPLv3 License - see the LICENSE file for details.
If you use these scripts and pipelines, please cite our preprint:
Unistrand piRNA clusters are an evolutionarily conserved mechanism to suppress endogenous retroviruses across the Drosophila genus
Jasper van Lopik, Azad Alizada, Maria-Anna Trapotsi, Gregory J. Hannon, Susanne Bornelöv, Benjamin Czech Nicholson
bioRxiv 2023.02.27.530199; doi: https://doi.org/10.1101/2023.02.27.530199