Representative sequence selection for large bioinformatics datasets
Updated May 27, 2026 - Python
Automated maximum-likelihood phylogeny pipeline for viral families. Discovers species via NCBI Taxonomy, downloads from GenBank, aligns with MAFFT, builds trees with FastTree (broad) and IQ-TREE (refined), and annotates internal nodes by LCA. Supports multi-marker concatenation for large DNA virus families.
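The "annotates internal nodes by LCA" step can be illustrated with a minimal sketch. This is not the pipeline's actual code: the nested-tuple tree encoding, the function names `lineage_lca` and `annotate`, and the placeholder taxa (`FamilyX`, `GenusA`, `sp1`, ...) are all assumptions made for illustration. The idea shown is that each internal node receives the deepest taxon shared by the root-to-tip lineages of every leaf beneath it.

```python
from typing import Dict, List, Optional, Sequence, Union

# Hypothetical tree encoding: a leaf is a name (str), an internal node is a tuple of children.
Tree = Union[str, tuple]


def lineage_lca(lineages: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the deepest taxon shared by all root-to-tip lineages, or None."""
    if not lineages:
        return None
    shared = None
    # Walk the ranks in parallel from the root until the lineages diverge.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared = ranks[0]
        else:
            break
    return shared


def annotate(node: Tree, leaf_lineage: Dict[str, List[str]],
             labels: Dict[str, Optional[str]], path: str = "root") -> List[str]:
    """Fill `labels` with an LCA taxon per internal node; return the leaves under `node`."""
    if isinstance(node, str):
        return [node]
    leaves: List[str] = []
    for i, child in enumerate(node):
        leaves.extend(annotate(child, leaf_lineage, labels, f"{path}.{i}"))
    labels[path] = lineage_lca([leaf_lineage[name] for name in leaves])
    return leaves


# Toy data with placeholder taxa (not real NCBI Taxonomy records).
toy_lineages = {
    "sp1": ["FamilyX", "GenusA", "sp1"],
    "sp2": ["FamilyX", "GenusA", "sp2"],
    "sp3": ["FamilyX", "GenusB", "sp3"],
}
toy_tree = (("sp1", "sp2"), "sp3")

labels: Dict[str, Optional[str]] = {}
annotate(toy_tree, toy_lineages, labels)
print(labels)
```

In this toy case the clade holding `sp1` and `sp2` is labeled with their shared genus, while the root falls back to the family because `sp3` belongs to a different genus. A real pipeline would presumably read lineages from NCBI Taxonomy and trees from Newick or phyloXML files rather than from literals.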